COMPUTER IMPLEMENTED METHOD OF PROCESSING A VISUAL QUERY, SERVER SYSTEM, AND COMPUTER-READABLE NON-TRANSIENT STORAGE MEDIA
ABSTRACT
A computer-implemented method of processing a visual query, a server system, and computer-readable non-transient storage media are provided. A facial recognition search system identifies one or more likely names (or other personal identifiers) corresponding to the facial image(s) in a query, as follows. After receiving the visual query with one or more facial images, the system identifies images that potentially match the respective facial image according to visual similarity criteria. Then one or more people associated with the potential images are identified. For each identified person, person-specific data comprising measures of social connectivity to the requester is retrieved from a plurality of applications, such as communications applications, social networking applications, calendaring applications, and collaborative applications. Then, an ordered list of people is generated by ranking the identified people according to at least measures of visual similarity between their facial image and the potential image matches and measures of social connection. Finally, at least one person identifier from the list is sent to the requester.
Publication number: BR112012002823B1
Application number: R112012002823-5
Filing date: 2010-08-06
Publication date: 2021-06-22
Inventors: David Petrou; Andrew Rabinovich; Hartwig Adam
Applicant: Google Llc
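The following is a minimal illustrative sketch, not part of the patent text, of the ranking step outlined in the abstract: candidate people found by image matching are ordered by a combination of visual similarity and social connectivity. All names (`Candidate`, `rank_people`) and the weighted-sum combination with weight `alpha` are assumptions for illustration; the specification does not prescribe a particular formula.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    person_id: str
    visual_similarity: float    # best similarity between the query face and that person's images
    social_connectivity: float  # aggregated measure across communication, social networking,
                                # calendar and collaborative applications

def rank_people(potential_matches, connectivity, alpha=0.5):
    """Order person identifiers by combined visual-similarity and social-connection scores.

    potential_matches: list of (person_id, similarity) pairs produced by image matching.
    connectivity:      mapping person_id -> measure of social connectivity to the requester.
    alpha:             assumed relative weight of visual similarity (not given in the text).
    """
    best = {}
    for person_id, similarity in potential_matches:
        c = best.setdefault(person_id,
                            Candidate(person_id, 0.0, connectivity.get(person_id, 0.0)))
        c.visual_similarity = max(c.visual_similarity, similarity)

    ranked = sorted(best.values(),
                    key=lambda c: alpha * c.visual_similarity
                                  + (1 - alpha) * c.social_connectivity,
                    reverse=True)
    return [c.person_id for c in ranked]

# Example: two candidate people for one facial image in the visual query.
print(rank_people([("alice", 0.82), ("bob", 0.79), ("alice", 0.65)],
                  {"alice": 0.10, "bob": 0.90}))
```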
PATENT DESCRIPTION
FIELD OF THE INVENTION
The disclosed embodiments relate, in general, to the identification of one or more people who potentially correspond to a face in an image query, using social network information and information obtained from other pictures of the identified person(s) to facilitate identification of the best matching person or persons.
BACKGROUND
Text-based or term-based searching, in which a user enters a word or phrase into a search engine and receives a variety of results, is a useful tool for searching. However, term-based queries require a user to be able to enter a relevant term. Sometimes a user may want to know information about an image. For example, a user might want to know the name of a person in a photograph. A person may also want to know other information, such as contact information, for a person in a photograph. Accordingly, a system that can receive a facial image query and provide a variety of search results related to a person identified in the facial image query would be desirable.
SUMMARY
Under some embodiments, a computer-implemented method of processing a visual query that includes a facial image is performed on a server system with one or more processors and memory that stores one or more programs for execution by the one or more processors. The method includes the process outlined below. A visual query comprising one or more facial images that include a respective facial image is received from a requester. Potential image matches that potentially match the respective facial image are identified according to visual similarity criteria. The potential image matches comprise images from one or more image sources identified according to data regarding the requester. One or more people associated with the potential image matches are identified. For each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications is retrieved. The plurality of applications is selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications. An ordered list of people is generated by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the potential image matches, and also according to ranking information that comprises at least the measures of social connection. Then at least one person identifier from the ordered list is sent to the requester. Such a method can also include program instructions to execute the additional options discussed in the following sections.
Under some embodiments, a server system is provided for processing a visual query that includes a facial image. The server system includes one or more processors for executing programs and memory that stores one or more programs to be executed by the one or more processors. The one or more programs include instructions for the process outlined below. A visual query comprising one or more facial images that include a respective facial image is received from a requester. Potential image matches that potentially match the respective facial image are identified according to visual similarity criteria. The potential image matches comprise images from one or more image sources identified according to data regarding the requester. One or more people associated with the potential image matches are identified.
For each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications is retrieved. The plurality of applications is selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications. An ordered list of people is generated by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the potential image matches, and also according to ranking information that comprises at least the measures of social connection. Then, at least one person identifier from the ordered list is sent to the requester. Such a system may also include program instructions to perform the additional options discussed in the following sections.
Under some embodiments, a non-transient computer readable storage media for processing a visual query that includes a facial image is provided. The computer readable storage media stores one or more programs configured for execution by a computer, the one or more programs comprising instructions for performing the following. A visual query comprising one or more facial images that include a respective facial image is received from a requester. Potential image matches that potentially match the respective facial image are identified according to visual similarity criteria. The potential image matches comprise images from one or more image sources identified according to data regarding the requester. One or more people associated with the potential image matches are identified. For each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications is retrieved. The plurality of applications is selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications. An ordered list of people is generated by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the potential image matches, and also according to ranking information comprising at least the measures of social connection. Then, at least one person identifier from the ordered list is sent to the requester. Computer-readable storage media such as this may also include program instructions for performing the additional options discussed in the following sections.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram illustrating a computer network that includes a visual query server system. Figure 2 is a flowchart illustrating the process for responding to a visual query according to some embodiments. Figure 3 is a flowchart illustrating the process for responding to a visual query with an interactive results document according to some embodiments. Figure 4 is a flowchart illustrating communications between a client and a visual query server system according to some embodiments. Figure 5 is a block diagram illustrating a client system according to some embodiments. Figure 6 is a block diagram illustrating an initial interface visual query processing server system according to some embodiments. Figure 7 is a block diagram illustrating a generic system of parallel search systems used to process a visual query according to some embodiments.
Figure 8 is a block diagram illustrating an OCR search system used to process a visual query according to some embodiments. Figure 9 is a block diagram illustrating a facial recognition search system used to process a visual query according to some modalities. Figure 10 is a block diagram illustrating an image search system by terms used to process a visual query according to some embodiments. Figure 11 illustrates a client system with a screenshot of an exemplary visual query according to some modalities. Figures 12A and 12B each illustrate a client system with a screen capture of an interactive results document with confirmation boxes according to some embodiments. Figure 13 illustrates a client system with a screenshot of an interactive results document that is type-coded according to some modalities. Figure 14 illustrates a client system with a screenshot of an interactive results document with labels according to some modalities. Figure 15 illustrates a screenshot of an interactive results document and visual query displayed concurrently with a list of results according to some modalities. Figures 16A - 16B are flowcharts illustrating the process of responding to a visual inquiry that includes a facial image in some embodiments. Figure 17 is a flowchart illustrating various factors and characteristics used in generating an ordered list of people who potentially match a facial image in a visual query according to some modalities. Fig. 18A is a block diagram illustrating a part of the data structure of a facial image database used by a facial recognition search system in accordance with some embodiments. Figure 18B illustrates relationships between people through a plurality of applications, such as social networking and communication applications, according to some embodiments. Figure 18C is a block diagram illustrating some image-derived features according to some embodiments. Like reference numerals refer to corresponding parts throughout the drawings. DESCRIPTION OF MODALITIES Reference will now be made in detail to the modalities, examples of which are illustrated in the attached drawings. In the following detailed description, numerous specific details are presented in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention can be practiced without these specific details. In other cases, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the modalities. It is also understood that although the terms first, second, etc. can be used here to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first contact can be called a second contact and, similarly, a second contact can be called a first contact, without departing from the scope of the present invention. Both the first contact and the second contact are contacts, but they are not the same contact. Here, the terminology used in describing the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the description of the invention and the appended claims, the singular forms "a", "an", "the" and "a" are intended to also include the plural forms, unless the context clearly indicates otherwise. 
It is also understood that the term "and/or", as used herein, refers to and encompasses any and all possible combinations of one or more of the associated listed items. It is further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of declared features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other resources, integers, steps, operations, elements, components and/or groups thereof. As used herein, the term "if" can be interpreted to mean "when" or "by" or "in response to determination" or "in response to detection", depending on the context. Similarly, the phrases "if determined" or "if (a stated condition or event) is detected" can be interpreted to mean "upon determination" or "in response to determination" or "upon detection (the stated condition or event) ” or “in response to detection (the stated condition or event)”, depending on the context. Figure 1 is a block diagram illustrating a computer network that includes a visual query server system according to some embodiments. Computer network 100 includes one or more client systems 102 and a visual query server system 106. One or more communication networks 104 interconnect these components. Communication networks 104 can be any of a variety of networks, including local area networks (LAN), wide area networks (WAN), wireless networks, wired networks, the Internet, or a combination of such networks. The client system 102 includes a client application 108 that is executed by the client system to receive a visual inquiry (e.g., visual inquiry 1102 of Fig. 11). A visual query is an image that is submitted as a query to a search engine or search engine. Examples of visual queries include, without limitation, photographs, scanned documents and images, and drawings. In some embodiments, client application 108 is selected from the set consisting of a search application, a search engine plug-in for a browser application, and a search engine extension for a browser application. In some embodiments, the client application 108 is an “omnivorous” search box, which allows a user to drag and drop any image format into the search box to be used as the visual query. . A client system 102 sends queries to and receives data from the visual query server system 106. The client system 102 can be any computer or other device that can communicate with the visual query server system 106. Examples include, without limitation, desktop computers. desktop and laptops, mainframe computers, server computers, mobile devices such as cell phones and personal digital assistants, network terminals and integrated receivers/decoders. The visual query server system 106 includes an initial interface visual query processing server 110. The initial interface server 110 receives a visual query from the client 102 and sends the visual query to a plurality of parallel query systems 112 for simultaneous processing. Each of the search systems 112 implements a distinct visual query search process and accesses its corresponding databases 114 as needed to process the visual query by its distinct search process. For example, a face recognition search system 112-A will access a facial image database 114-A to search for facial matches in relation to the image query. 
As will be explained in more detail with reference to Figure 9, if the visual query contains a face, the facial recognition search system 112-A will retrieve one or more search results (e.g., names, matching faces, etc.) from the facial image database 114-A. In another example, the optical character recognition (OCR) search system 112-B converts any recognizable text in the visual query to text for return as one or more search results. In the optical character recognition (OCR) search system 112-B, an OCR database 114-B can be accessed to recognize particular text fonts or patterns, as explained in more detail with reference to Figure 8. Any number of parallel search systems 112 can be used. Some examples include a facial recognition search system 112-A, an OCR search system 112-B, an image-by-terms search system 112-C (which can recognize an object or a category of objects), a product recognition search system (which can be configured to recognize 2D images such as book covers and CDs, and can also be configured to recognize 3D images such as furniture), a barcode recognition search system (which recognizes 1D and 2D style barcodes), a named entity recognition search system, a landmark recognition search system (which can be configured to recognize particular famous landmarks, such as the Eiffel Tower, and can also be configured to recognize a body of specific images, such as billboards), a location recognition search system aided by geolocation information provided by a GPS receiver in client system 102 or by a cellular network, a color recognition search system, and a similar image search system (which searches for and identifies images similar to a visual query). Additional search systems can be added as additional parallel search systems, represented in Figure 1 by system 112-N. All of the search systems, except the OCR search system, are collectively defined here as search systems that perform an image matching process. All of the search systems, including the OCR search system, are collectively referred to as image query search systems. In some embodiments, the visual query server system 106 includes a facial recognition search system 112-A, an OCR search system 112-B, and at least one other image query search system 112. Each of the parallel search systems 112 individually processes the visual query and returns its results to the initial interface server system 110. In some embodiments, the initial interface server 110 may perform one or more analyses on the search results, such as one or more of: aggregating the results into a composite document, choosing a subset of the results to display, and ranking the results, as will be explained in more detail with reference to Figure 6. The initial interface server 110 communicates the search results to the client system 102. The client system 102 presents the one or more search results to the user. Results can be presented on a screen, by an audio speaker, or by any other device used to communicate information to a user. The user can interact with the search results in a variety of ways. In some embodiments, the user's selections, annotations and other interactions with the search results are transmitted to the visual query server system 106 and recorded along with the visual query in a query and annotation database 116. Information in the query and annotation database can be used to improve visual query results.
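Purely as an illustrative sketch, the query and annotation database 116 mentioned above could record each visual query together with the requester's selections and annotations for later reuse. The SQLite storage, table name and columns below are assumptions, not details given in the specification.

```python
import sqlite3, json, time

# A minimal sketch of the query-and-annotation store (database 116); the schema
# and the use of SQLite are illustrative assumptions.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE query_annotations (
                  query_id TEXT, requester TEXT, received_at REAL,
                  selections TEXT, annotations TEXT)""")

def record_interaction(query_id, requester, selections, annotations):
    """Store a visual query's user selections/annotations for later reuse
    (e.g. periodic transfer to the individual search-system databases 114)."""
    db.execute("INSERT INTO query_annotations VALUES (?, ?, ?, ?, ?)",
               (query_id, requester, time.time(),
                json.dumps(selections), json.dumps(annotations)))
    db.commit()

record_interaction("q-123", "user@example.com",
                   selections=[{"result_type": "facial_recognition", "rank": 1}],
                   annotations=[{"text": "this is Bob", "subpart": [40, 60, 120, 140]}])
```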
In some embodiments, information from the lookup and annotation database 116 is periodically transferred to parallel search systems 112, which incorporate all relevant pieces of information into their respective individual databases 114. Optionally, computer network 100 includes a term query server system 118 to perform searches in response to term queries. A term query is a query that contains one or more terms, as opposed to a visual query that contains an image. The term query server system 118 can be used to generate search results that supplement the information produced by the various search engines in the visual query server system 106. The results retrieved from the term query server system 118 may include any Format. The term lookup server system 118 may include textual documents, images, video, etc. Although the term query server system 118 is shown as a separate system in Figure 1, optionally, the visual query server system 106 may include a term query server system 118. Additional information on the operation of the visual query server system 106 is provided below in relation to the flowcharts of Figures 2-4. Figure 2 is a flowchart illustrating a method of the visual query server system for responding to a visual query in accordance with certain embodiments of the invention. Each of the operations shown in Figure 2 may correspond to instructions stored in computer memory or computer-readable storage media. The visual query server system receives a visual query from a client system (202). The client system, for example, can be a desktop computing device, a mobile device, or another similar device (204), as explained in relation to Figure 1. An exemplary visual query on an exemplary client system is shown in Figure 11. A visual query is an image document of any suitable format. For example, the visual query can be a photograph, a screen capture, a digitized image, or a frame or multi-frame sequence of video (206). In some modalities, the visual query is a drawing produced by a content authoring program (736, figure 5). As such, in some modalities, the user “draws” the visual query, while in other modalities, the user digitizes or photographs the visual query. Some visual queries are created using an image-generating application such as Acrobat, a photo editing program, a drawing program, or an image editing program. For example, a visual query might come from a user who takes a photograph of their friend on their cell phone and then submits the photograph as the visual query to the server system. Visual query can also come from a user who scans a page from a magazine or takes a screenshot of a web page on a desktop computer and then submits the scan or screenshot as the visual query to the server system. In some embodiments, the visual query is submitted to the server system 106 via a browser application browser extension, via a browser application plug-in, or via a browser application run by client system 102. Visual queries can also be submitted by other application programs (executed by a client system) that support or generate images that can be transmitted to a server remotely located by the client system. Visual query can be a combination of textual and non-textual elements (208). For example, a query might be a scan of a magazine page that contains images and text, such as a person standing near a traffic sign. 
A visual query can include an image of a person's face, whether taken by a camera built into the client system or a document scanned or otherwise received by the client system. A visual query can also be a scan of a document that contains only text. The visual query can also be an image of a number of different subjects, such as several birds in a forest, a person and an object (eg car, park bench, etc.), a person and an animal (eg animal pet, farm animal, butterfly, etc.). Visual queries can have two or more distinct elements. For example, a visual query might include a barcode and an image of a product or product name on a product packaging. For example, the visual query might be a picture of a book cover that includes the book's title, cover art, and a barcode. In some cases, a visual query will produce two or more distinct search results corresponding to different parts of the visual query, as discussed in more detail below. The server system processes the visual query as follows. The initial interface server system sends the visual query to a plurality of parallel search systems for simultaneous processing (210). Each search system implements a distinct visual query search process, that is, an individual search system processes the visual query by its own processing scheme. In some embodiments, one of the search systems to which the visual query is sent for processing is an optical character recognition (OCR) search system. In some modalities, one of the search systems to which the visual query is sent for processing is a facial recognition search system. In some embodiments, the plurality of search systems that perform distinct visual query search processes include at least: optical character recognition (OCR), facial recognition, and an image query process other than OCR and facial recognition (212 ). The other image query process is selected from a set of processes that include, but are not limited to, product recognition, barcode recognition, object or object category recognition, named entity recognition, and color recognition ( 212). In some modalities, named entity recognition occurs as a post-process of the OCR search engine, in which the result of the OCR text is analyzed in relation to famous people, places and objects and the like, and then the terms identified as entities names are fetched in the query server system by term (118, figure 1). In other modalities, images of famous landmarks, logos, people, album covers, trademarks, etc. are recognized by an image search engine by terms. In other embodiments, a distinguished named entity image query process separate from the term image search engine is used. The object or object category recognition system recognizes generic result types such as “car”. In some modalities, this system also recognizes product brands, particular product models and the like, and provides more specific descriptions, such as “Porsche”. Some of the search engines may be special user-specific search engines. For example, particular versions of color recognition and facial recognition may be special search systems used by the blind. The initial interface server system receives results from the parallel search systems (214). In some modalities, the results are accompanied by a search score. For some visual queries, some of the search engines will not find relevant results. For example, if the visual query was a picture of a flower, the facial recognition search system and the barcode search system will not find any relevant results. 
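A rough sketch, under assumed names, of how an initial interface server might send one visual query to several parallel search systems for simultaneous processing (210) and gather their scored results (214); the three stub search functions and the thread-pool concurrency model are illustrative assumptions, not details from the specification.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-ins for the parallel search systems 112; each returns
# (score, results) for the same visual query using its own search process.
def ocr_search(query):     return 0.0, []             # no recognizable text found
def facial_search(query):  return 0.9, ["face hits"]  # facial matches found
def terms_search(query):   return 0.4, ["flower"]     # image-by-terms results

SEARCH_SYSTEMS = {"ocr": ocr_search, "facial": facial_search, "terms": terms_search}

def dispatch(visual_query):
    """Send the visual query to every parallel search system at once and gather
    their scored results (a sketch of steps 210/214; concurrency model assumed)."""
    with ThreadPoolExecutor(max_workers=len(SEARCH_SYSTEMS)) as pool:
        futures = {name: pool.submit(fn, visual_query)
                   for name, fn in SEARCH_SYSTEMS.items()}
        return {name: f.result() for name, f in futures.items()}

print(dispatch("photo-of-a-flower.jpg"))
```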
In some embodiments, if no relevant results are found, a null or zero search score is received from that search system (216). In some embodiments, if the initial interface server does not receive a result from a search system after a predefined period of time (e.g., 0.2, 0.5, 1, 2 or 5 seconds), it will process the results received as if that timed-out search system had produced a null search score and will process the results received from the other search systems. Optionally, when at least two of the received search results satisfy predefined criteria, they are ranked (218). In some embodiments, one of the predefined criteria excludes empty results; that is, a predefined criterion is that the results are not empty. In some embodiments, one of the predefined criteria excludes results with a numerical score (for example, for a relevance factor) that falls below a predefined minimum score. Optionally, the plurality of search results is filtered (220). In some embodiments, results are filtered only if the total number of results exceeds a predefined threshold. In some embodiments, all results are ranked, but results that fall below a predefined minimum score are excluded. For some visual queries, the content of the results is filtered. For example, if some of the results contain private information or protected personal information, these results are filtered out. Optionally, the visual query server system creates a composite search result (222). One example of this is when search results from more than one of the search systems are embedded in an interactive results document, as explained in relation to Figure 3. The term query server system (118, Figure 1) can augment the results from one of the parallel search systems with results from a search by term, where the additional results are either links to documents or information sources, or text and/or images that contain additional information that may be relevant to the visual query. So, for example, the composite search result might contain an OCR result and a link to a named entity in the OCR document (224). In some embodiments, the OCR search system (112-B, Figure 1) or the initial interface visual query processing server (110, Figure 1) recognizes likely relevant words in the text. For example, they might recognize named entities, such as famous people or places. The named entities are submitted as query terms to the term query server system (118, Figure 1). In some embodiments, the term query results produced by the term query server system are incorporated into the visual query result as a “link”. In some embodiments, query-by-term results are returned as separate links. For example, if a picture of a book cover is the visual query, an object recognition search system is likely to produce a high score for the book. As such, a term query for the book title will be run on the term query server system 118 and the term query results are returned along with the results of the visual query. In some embodiments, query-by-term results are presented in a labeled group to distinguish them from visual query results. Results can be searched individually, or a search can be performed using all named entities recognized in the search query to produce additional, particularly relevant search results. For example, if the visual query is a digitized travel guide about Paris, the returned result might include links to the term query server system 118 to initiate a search on the term query “Notre Dame”.
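Continuing in the same illustrative vein, the augmentation of an OCR result with term query links for recognized named entities (222, 224) might be sketched as follows; the entity list and the `term_search` stub standing in for the term query server system 118 are assumptions.

```python
# A rough sketch of building a composite result (222/224): named entities that the
# OCR text mentions are turned into term queries whose results are attached as links.
# The entity list and the term-search stub are illustrative assumptions.
KNOWN_ENTITIES = {"Notre Dame", "Eiffel Tower", "Louvre"}

def term_search(term):
    # Stand-in for the term query server system 118.
    return {"query": term, "link": f"https://search.example.com/?q={term.replace(' ', '+')}"}

def composite_result(ocr_text):
    entities = [e for e in KNOWN_ENTITIES if e.lower() in ocr_text.lower()]
    return {
        "ocr_result": ocr_text,
        "term_query_links": [term_search(e) for e in entities],
    }

print(composite_result("A digitized travel guide about Paris: visit Notre Dame and the Louvre."))
```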
Similarly, composite search results may include results from text searches for recognized famous images. For example, in the same travel guide, dynamic links to query-by-term results for famous destinations shown as pictures in the guide, such as “Eiffel Tower” and “Louvre”, can also be shown (even if the terms “Eiffel Tower” and “Louvre” did not appear in the guide itself). Then, the visual query server system sends at least one result to the client system (226). Typically, if the visual query processing server receives a plurality of search results from at least some of the plurality of search systems, it will send at least one of the plurality of search results to the client system. For some visual queries, only one search system will return relevant results. For example, in a visual query that contains only an image of text, only the results from the OCR server might be relevant. For some visual queries, only one result from one search system may be relevant. For example, only the product related to a scanned barcode may be relevant. In these cases, the initial interface visual processing server will return only the relevant search result(s). For some visual queries, a plurality of search results are sent to the client system, and the plurality of search results include search results from more than one of the parallel search systems (228). This can occur when more than one distinct image is in the visual query. For example, if the visual query were a picture of a person riding a horse, the facial recognition results for the person might be displayed along with object identification results for the horse. In some embodiments, all results from a particular image query search system are grouped and presented together. For example, the first N facial recognition results are displayed under a heading “facial recognition results” and the first N object recognition results are displayed together under a heading “object recognition results”. Alternatively, as discussed below, search results from a particular image search system can be grouped by image region. For example, if the visual query includes two faces, both of which produce facial recognition results, the results for each face will be presented as a distinct group. For some visual queries (for example, a visual query that includes an image of both text and one or more objects), search results may include both OCR results and one or more image matching results (230). In some embodiments, the user may wish to learn more about a particular search result. For example, if the visual query was a picture of a dolphin and the “image by terms” search system returns the terms “water”, “dolphin”, “blue” and “Flipper”, the user may wish to perform a text-based term query on “Flipper”. When the user wishes to perform a search on a term query (for example, as indicated by the user clicking on or otherwise selecting a corresponding link in the search results), the term query server system (118, Figure 1) is accessed, and the search on the selected term(s) is performed. The corresponding term query results are displayed in the client system either separately or together with the results of the visual query (232). In some embodiments, the initial interface visual query processing server (110, Figure 1) automatically chooses (that is, without receiving any user commands other than the initial visual query) one or more top potential text results for the visual query,
executes these text results on the term query server system 118, and then returns these term query results along with the result of the visual query to the client system as a part of sending at least one search result to the client system (232). In the above example, if “Flipper” was the first term result for a visual query picture of a dolphin, the initial interface server runs a term query on “Flipper” and returns these term query results along with the results of the visual query to the client system. This embodiment, in which a term result that is considered likely to be selected by the user is automatically executed before sending the search results of the visual query to the user, saves the user time. In some embodiments, these results are displayed as a composite search result (222), as explained above. In other embodiments, the results are part of a search result list rather than, or in addition to, a composite search result. Figure 3 is a flowchart illustrating the process for responding to a visual query with an interactive results document. The first three operations (202, 210, 214) are described above in relation to Figure 2. From the search results that are received from the parallel search systems (214), an interactive results document is created (302). The creation of the interactive results document (302) will now be described in detail. For some visual queries, the interactive results document includes one or more visual identifiers of respective subparts of the visual query. Each visual identifier has at least one user-selectable link to at least one of the search results. A visual identifier identifies a respective subpart of the visual query. For some visual queries, the interactive results document has only one visual identifier with a user-selectable link to one or more results. In some embodiments, a respective user-selectable link to one or more of the search results has an activation region, and the activation region corresponds to the subpart of the visual query that is associated with a corresponding visual identifier. In some embodiments, the visual identifier is a confinement box (304). In some embodiments, the confinement box encloses a subpart of the visual query, as shown in Figure 12A. The confinement box need not be a square or rectangular box, but can be any type of shape, including circular, oval, conformal (for example, to an object, entity or region of the visual query), irregular, or any other shape, as shown in Figure 12B. For some visual queries, the confinement box delineates the boundary of an identifiable entity in a subpart of the visual query (306). In some embodiments, each confinement box includes a user-selectable link to one or more search results, where the user-selectable link has an activation region corresponding to a subpart of the visual query surrounded by the confinement box. When the space inside the confinement box (the activation region of the user-selectable link) is selected by the user, search results that match the image in the outlined subpart are returned. In some embodiments, the visual identifier is a label (307), as shown in Figure 14. In some embodiments, the label includes at least one term associated with the image in the respective subpart of the visual query. Each label is formatted for presentation in the interactive results document in or near its respective subpart. In some embodiments, labels are color coded.
In some embodiments, each of the respective visual identifiers is formatted for presentation in a visually distinctive manner according to an entity type recognized in the respective subpart of the visual query. For example, as shown in Figure 13, each of the confinement boxes around a product, a person, a trademark, and the two textual areas is presented with a distinct hatch pattern, representing differently colored transparent confinement boxes. In some embodiments, visual labels are formatted for presentation in visually distinctive ways, such as overlay color, overlay pattern, label background color, label background, label font color, and border color. In some embodiments, the user-selectable link in the interactive results document is a link to a document or object that contains one or more results related to the corresponding subpart of the visual query (308). In some embodiments, at least one search result includes data related to the corresponding subpart of the visual query. As such, when the user selects the selectable link associated with the respective subpart, the user is directed to search results corresponding to the entity recognized in the respective subpart of the visual query. For example, if a visual query was a photograph of a barcode, there may be parts of the photograph that are irrelevant parts of the package on which the barcode was affixed. The interactive results document can include a confinement box around the barcode only. When the user selects the inside of the outlined barcode confinement box, the barcode search result is displayed. The barcode search result may include one result, the product name corresponding to that barcode, or the barcode results may include several results, such as a variety of locations where this product can be purchased, reviewed, etc. In some embodiments, when the subpart of the visual query corresponding to a respective visual identifier contains text that comprises one or more terms, the search results corresponding to the respective visual identifier include results of a term query search on at least one of the terms in the text. In some embodiments, when the subpart of the visual query corresponding to a respective visual identifier contains the face of a person for which at least one match (i.e., search result) is found that satisfies predefined reliability (or other) criteria, the search results corresponding to the respective visual identifier include one or more of: a name, an identifier, contact information, account information, address information, the current location of a mobile device associated with the person whose face is contained in the selectable subpart, other images of the person whose face is contained in the selectable subpart, and potential image matches for the person's face. In some embodiments, when the subpart of the visual query corresponding to a respective visual identifier contains a product for which at least one match (i.e., search result) was found that satisfies predefined reliability (or other) criteria, the search results corresponding to the respective visual identifier include one or more of: product information, a product review, an option to initiate a product purchase, an option to initiate a product offer, a list of similar products, and a list of related products. Optionally, a respective user-selectable link in the interactive results document includes anchor text, which is displayed in the document without having to activate the link.
Anchor text provides information, such as a key word or term, related to the information obtained when the link is activated. Anchor text can be displayed as part of the label (307), in a part of a confinement box (304), or as additional information displayed when a user hovers a cursor over a user-selectable link for a predetermined period of time, such as 1 second. Optionally, a respective user-selectable link in the interactive results document is a link to a search engine for searching for information or documents corresponding to a text-based query (sometimes called a term query here). Activating the link causes the search engine to run, where the query and the search engine are specified by the link (for example, the search engine is specified by a URL in the link and the text-based search query is specified by a URL parameter of the link), with results returned to the client system. Optionally, the link in this example can include anchor text that specifies the text or terms in the search query. In some embodiments, the interactive results document produced in response to a visual query may include a plurality of links that correspond to results from the same search system. For example, a visual query might be an image or picture of a group of people. The interactive results document can include confinement boxes around each person which, when activated, return results from the facial recognition search system for each face in the group. For some visual queries, a plurality of links in the interactive results document corresponds to search results from more than one search system (310). For example, if a picture of a person and a dog were submitted as the visual query, confinement boxes in the interactive results document might delineate the person and the dog separately. When the person (in the interactive results document) is selected, search results from the facial recognition search system are returned, and when the dog (in the interactive results document) is selected, results from the image search system by terms are returned. For some visual queries, the interactive results document contains an OCR result and an image match result (312). For example, if a picture of a person standing near a sign was submitted as a visual query, the interactive results document might include visual identifiers for the person and for the text on the sign. Similarly, if a scan of a magazine was used as the visual query, the interactive results document may include visual identifiers for photographs or trademarks in advertisements on the page, as well as a visual identifier for the text of an article also on that page. Once the interactive results document has been created, it is sent to the client system (314). In some embodiments, the interactive results document (e.g., document 1200, Figure 15) is sent together with a list of search results from one or more parallel search systems, as discussed above in relation to Figure 2. In some embodiments, the interactive results document is displayed in the client system above or otherwise adjacent to a list of search results from one or more parallel search systems (315), as shown in Figure 15. Optionally, the user will interact with the results document by selecting a visual identifier in the results document. The server system receives, from the client system, information regarding the user's selection of a visual identifier in the interactive results document (316).
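As a hedged sketch of one possible representation, the interactive results document described above could be serialized as a set of visual identifiers, each carrying an activation region, an entity type, optional anchor text and a user-selectable link; the field names and the JSON serialization below are assumptions, not a format defined in the specification.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional
import json

@dataclass
class VisualIdentifier:
    """One visual identifier of a subpart of the visual query (a confinement box
    or a label), with a user-selectable link to the corresponding results."""
    kind: str                   # "bounding_box" or "label"
    subpart: List[int]          # activation region as [x, y, width, height]
    entity_type: str            # e.g. "face", "barcode", "text", "product"
    anchor_text: Optional[str]  # shown without activating the link
    link: str                   # fetches results for this subpart when selected

@dataclass
class InteractiveResultsDocument:
    query_id: str
    identifiers: List[VisualIdentifier]

    def to_json(self):
        return json.dumps(asdict(self))

doc = InteractiveResultsDocument("q-123", [
    VisualIdentifier("bounding_box", [40, 60, 120, 140], "face",
                     anchor_text=None, link="/results/q-123/face/0"),
    VisualIdentifier("label", [10, 300, 200, 40], "text",
                     anchor_text="Notre Dame", link="/results/q-123/text/0"),
])
print(doc.to_json())
```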
As discussed above, in some embodiments, the link is activated by selecting an activation region within a confinement box. In other embodiments, the link is activated by a user selection of a visual identifier of a subpart of the visual query that is not a confinement box. In some embodiments, the linked visual identifier is a quick button, a label located near the subpart, an underlined word in the text, or another representation of an object or subject in the visual query. In embodiments where the search results list is presented with the interactive results document (315), when the user selects a user-selectable link (316), the search result in the search results list corresponding to the selected link is identified. In some embodiments, the cursor will automatically jump or move to the first result corresponding to the selected link. In some embodiments where the screen of the client 102 is too small to display both the interactive results document and the entire search results list, selecting a link in the interactive results document causes the search results list to scroll or jump so as to display at least a first result corresponding to the selected link. In some other embodiments, in response to the user's selection of a link in the interactive results document, the results list is reordered so that the first result corresponding to the link is displayed at the top of the results list. In some embodiments, when the user selects the user-selectable link (316), the visual query server system sends at least a subset of the results related to a corresponding subpart of the visual query to the client for display to the user (318). In some embodiments, the user can select multiple visual identifiers concurrently and will receive a subset of the results for all selected visual identifiers at the same time. In other embodiments, search results corresponding to the user-selectable links are preloaded onto the client prior to the user selecting any of the user-selectable links, so as to provide search results to the user virtually instantaneously in response to the user's selection of one or more links in the interactive results document. Figure 4 is a flowchart illustrating communications between a client and a visual query server system. Client 102 receives a visual query from a user/querier/requester (402). In some embodiments, visual queries can only be accepted from users who have subscribed or “joined” the visual query system. In some embodiments, searches for facial recognition matches are performed only for users who have subscribed to the facial recognition visual query system, while other types of visual queries are performed for anyone, regardless of whether they have “joined” the facial recognition part. As stated, the visual query format can take many forms. The visual query will likely contain one or more subjects located in subparts of the visual query document. For some visual queries, the client system 102 performs type recognition preprocessing on the visual query (404). In some embodiments, the client system 102 searches for particular recognizable patterns in this preprocessing. For example, for some visual queries, the client can recognize colors. For some visual queries, the client may recognize that a particular subpart is likely to contain text (because this area is made up of small dark characters surrounded by light space, etc.). The client can contain any number of type recognition preprocesses or type recognition modules.
In some embodiments, the client will have a type recognition module (barcode recognition 406) to recognize barcodes. It can do this by recognizing the distinctive striped pattern in a rectangular area. In some embodiments, the client will have a type recognition module (face detection 408) to recognize that a particular subject or subpart of the visual query is likely to contain a face. In some embodiments, the recognized “type” is returned to the user for verification. For example, client system 102 may return a message that states “a barcode was found in your visual query, are you interested in receiving barcode query results?”. In some embodiments, the message may also indicate the subpart of the visual query where the type was found. In some embodiments, this presentation is similar to the interactive results document discussed in relation to Figure 3. For example, it can outline a subpart of the visual query, indicate that the subpart is likely to contain a face, and ask the user whether he or she is interested in receiving facial recognition results. After the client 102 performs the optional pre-processing of the visual query, the client sends the visual query to the visual query server system 106, specifically, to the initial interface visual query processing server 110. In some embodiments, if the pre-processing produced relevant results, that is, if one of the type recognition modules produced results above a certain threshold, indicating that the query or a subpart of the query is likely to be of a particular type (face, text, barcode, etc.), the client will transfer information regarding the pre-processing results. For example, the client can indicate that the face recognition module is 75% sure that a particular subpart of the visual query contains a face. More generally, pre-processing results, if any, include one or more subject type values (e.g., barcode, face, text, etc.). Optionally, the pre-processing results sent to the visual query server system include one or more of: for each subject type value in the pre-processing results, information identifying a subpart of the visual query corresponding to the subject type value and, for each subject type value in the pre-processing results, a confidence value indicating a level of confidence in the subject type value and/or the identification of a corresponding subpart of the visual query. The initial interface server 110 receives the visual query from the client system (202). The received visual query may contain the pre-processing information discussed above. As stated, the initial interface server sends the visual query to a plurality of parallel search systems (210). If the initial interface server 110 received pre-processing information regarding the probability that a subpart contained a subject of a certain type, the initial interface server may transfer this information to one or more of the parallel search systems. For example, it can transfer the information that a particular subpart is likely to be a face, so that the facial recognition search system 112-A can process this subpart of the visual query first. Similarly, the same information (that a particular subpart is likely to be a face) can be used by the other parallel search systems to ignore this subpart or to analyze other subparts first. In some embodiments, the initial interface server will not transfer the pre-processing information to the parallel search systems, but will instead use this information to enhance the way in which it processes the results received from the parallel search systems.
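The optional pre-processing results described above (subject type values, corresponding subparts and confidence values) might be conveyed in a structure like the following sketch; the field names, the confidence threshold and the mapping from subject types to search systems are illustrative assumptions.

```python
# A sketch of the optional pre-processing results a client might attach to a visual
# query: one entry per detected subject type, with the corresponding subpart of the
# image and a confidence value. The exact structure is an assumption.
preprocessing_results = [
    {"subject_type": "face",    "subpart": [40, 60, 120, 140], "confidence": 0.75},
    {"subject_type": "barcode", "subpart": [300, 420, 80, 40], "confidence": 0.60},
]

def prioritize_systems(results, threshold=0.5):
    """Which parallel search systems the initial interface server might contact first,
    based on the client's type-recognition results (mapping assumed)."""
    type_to_system = {"face": "facial recognition 112-A",
                      "barcode": "barcode recognition",
                      "text": "OCR 112-B"}
    return [type_to_system[r["subject_type"]]
            for r in sorted(results, key=lambda r: r["confidence"], reverse=True)
            if r["confidence"] >= threshold and r["subject_type"] in type_to_system]

print(prioritize_systems(preprocessing_results))
```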
As explained with reference to Fig. 2, for some visual queries, the initial interface server 110 receives a plurality of search results from the parallel search systems (214). Then, the home interface server can perform a variety of ranking and filtering, and can create an interactive search result document as explained in relation to figures 2 and 3. If the home interface server 110 has received pre-processing information in relation to the probability that a subpart contained a subject of a certain type, he can filter and sort, giving preference to those results that correspond to the pre-processed recognized subject type. If the user has indicated that a particular type of result was requested, the home UI server will take the user's requests into account when processing the results. For example, the home interface server might filter out all other results if the user requested only barcode information, or the home interface server will list all results that pertain to the requested type before listing the other results. If an interactive visual query document is resumed, the server can prefetch the links associated with the type of result the user indicated interest in, while only providing links to perform related searches for the other subjects indicated in the interactive results document. Then, the initial interface server 110 sends the search results to the client system (226). Client 102 receives the results from the server system (412). When applicable, these results will include the results that correspond to the type of result found in the pre-processing stage. For example, in some modalities they will include one or more barcode results (414) or one or more face recognition results (416). If the client's preprocessor modules indicated that a particular type of result was likely, and this result was found, the results found of that type will be listed prominently. Optionally, the user will select or annotate one or more of the results (418). The user can select a search result, can select a particular type of search result, and/or can select a portion of an interactive results document (420). Selecting a result is implicit feedback that the returned result was relevant to the query. Such feedback information can be used in future query processing operations. An annotation provides explicit feedback on the returned result that can also be used in future query processing operations. Annotations take the form of corrections to parts of the resumed result (such as a correction of a misrecognized word by OCR) or a separate annotation (either in free form or structured form). User selection of a search result that selects, in general, the “correct” result from several of the same type (for example, choosing the correct result from a facial recognition server) is a process that is referred to as a selection between interpretations. User selection of a particular type of search result that generally selects the “type” result of interest from several different types of retrieved results (for example, choosing the OCR-recognized text of an article in a magazine rather than the visual results for the ads also on the same page) is a process that is referred to as intention disambiguation. A user can similarly select particular linked words (such as recognized named entities) in a document recognized by OCR as explained in detail with respect to Figure 8. The user may wish to alternatively or additionally annotate particular search results. 
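As an illustrative sketch only, the selections and annotations discussed here and in the following paragraph could be reported back to the server system (424) in a payload such as the one below; all field names, including the distinction between "free_form" and "structured" annotations and the optional confinement box, are assumptions rather than a format defined in the specification.

```python
import json

# A sketch of the feedback a client might send back (steps 418-424): an implicit
# selection of one returned result plus explicit annotations, optionally tied to a
# user-drawn confinement box. All field names are illustrative assumptions.
feedback = {
    "query_id": "q-123",
    "selection": {                      # selection among interpretations
        "search_system": "facial_recognition",
        "result_rank": 2,               # the user chose the second facial match
    },
    "annotations": [
        {"form": "free_form", "text": "this is my colleague Bob"},
        {"form": "structured",
         "fields": {"subject": "book", "opinion": "this is a good book"},
         "confinement_box": [12, 30, 200, 260]},
    ],
}
print(json.dumps(feedback, indent=2))
```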
This annotation can be done in a freeform style or in a structured format (422). Annotations can be descriptions of the result or they can be reviews of the result. For example, they might indicate the name of the subject(s) in the result, or they might indicate “this is a good book” or “this product broke within a year of purchase”. Another example of an annotation is a user-drawn confinement box around a subpart of the visual query together with user-provided text that identifies the object or subject within the confinement box. User annotations are explained in more detail in relation to Figure 5. User selections of search results and other annotations are sent to the server system (424). The initial interface server 110 receives the selections and annotations and further processes them (426). If the information was a selection of an object, subregion or term in an interactive results document, additional information regarding this selection may be requested, as appropriate. For example, if the selection was of a visual result, more information about this visual result will be requested. If the selection was of a word (whether from the OCR server or from the image-by-terms server), a textual search on this word will be sent to the term query server system 118. If the selection was of a person from a facial image recognition search system, that person's profile will be requested. If the selection was of a particular part of an interactive search results document, the underlying visual query results will be requested. If the server system receives an annotation, the annotation is stored in the query and annotation database 116, explained in relation to Figure 5. Then, information from the annotation database 116 is periodically copied to individual annotation databases of one or more of the parallel server systems, as discussed below in relation to Figures 7-10. Figure 5 is a block diagram illustrating a client system 102 in accordance with an embodiment of the present invention. Typically, client system 102 includes one or more processing units (CPUs) 702, one or more network interfaces or other communications interfaces 704, memory 712, and one or more communication buses 714 to interconnect these components. The client system 102 includes a user interface 705. The user interface 705 includes a display device 706 and optionally includes an input device such as a keyboard, mouse, or other input buttons 708. Alternatively, or in addition, the display device 706 includes a touch-sensitive surface 709, in which case the screen 706/709 is a touch-sensitive screen. On client systems that have a touchscreen 706/709, a physical keyboard is optional (for example, a software keyboard can be displayed when keyboard input is required). Furthermore, some client systems use a microphone and speech recognition to supplement or replace the keyboard. Optionally, the client 102 includes a GPS (global positioning satellite) receiver or other location sensing apparatus 707 to determine the location of the client system 102. In some embodiments, visual query search services are provided that require the client system 102 to provide the visual query server system with location information indicating the location of the client system 102. The client system 102 also includes an image capture device 710, such as a camera or scanner.
Memory 712 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. Memory 712 may optionally include one or more storage devices remotely located with respect to the CPU(s) 702. Memory 712 or, alternatively, the non-volatile memory device(s) in memory 712 comprise a non-transient computer readable storage media. In some embodiments, memory 712 or the computer-readable storage media of memory 712 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 716 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 718 that is used to connect the client system 102 to other computers via one or more network communication interfaces 704 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• an image capture module 720 for processing a respective image captured by the image capture device/camera 710, in which the respective image can be sent (for example, by a client application module) as a visual query to the visual query server system;
• client application modules 722 for handling various aspects of querying by image, including, but not limited to: an image query submission module 724 for submitting visual queries to the visual query server system; optionally, a region of interest selection module 725 that detects a selection (such as a gesture on the touch screen 706/709) of a region of interest in an image and prepares this region of interest as a visual query; a results browser 726 for displaying the results of the visual query; and, optionally, an annotation module 728 with optional modules for structured annotation text entry 730, such as filling in a form, or for free-form annotation text entry 732, which can accept annotations in a variety of formats, and an image region selection module 734 (sometimes referred to herein as the result selection module) that allows a user to select a particular subpart of an image for annotation;
• optional content authoring application(s) 736 that allow a user to author a visual query by creating or editing an image rather than just capturing one through the image capture device 710; optionally, one or more such applications 736 may include instructions that enable a user to select a subpart of an image for use as a visual query;
• an optional local image analysis module 738 that preprocesses the visual query before sending it to the visual query server system; local image analysis can recognize particular image types, or sub-regions within an image; examples of image types that can be recognized by such modules 738 include one or more of: face type (a facial image recognized in the visual query), barcode type (a barcode recognized in the visual query), and text type (text recognized in the visual query); and
• additional optional client applications 740, such as an email application, a phone application, a browser application, a mapping application, an instant messaging application, a social networking application, etc.
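To tie the modules listed above together, the following is a deliberately simplified sketch of a client-side flow through them; every function is a named placeholder for the corresponding module (image capture 720, region-of-interest selection 725, submission 724, results browser 726, annotation 728), not code from the specification.

```python
# A compact sketch of how the client modules listed above might cooperate; every
# function below is a hypothetical stand-in for the corresponding module.
def capture_image():                 return "raw-image-bytes"       # image capture module 720
def select_region_of_interest(img):  return img                     # optional module 725
def submit_visual_query(img):        return {"results": ["..."]}    # submission module 724
def display_results(results):        print("showing", results)      # results browser 726
def annotate(results):               return {"text": "correct match"}  # annotation module 728

def run_visual_query_client():
    image = capture_image()
    query = select_region_of_interest(image)
    results = submit_visual_query(query)
    display_results(results)
    return annotate(results)

run_visual_query_client()
```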
In some embodiments, the application corresponding to an appropriate actionable search result can be started or called up when the actionable search result is selected. Optionally, the image region selection module 734, which allows a user to select a particular subpart of an image for annotation, also allows the user to choose a search result as a "correct" hit without necessarily annotating it further. For example, the user may be presented with the top N facial recognition matches and may choose the correct person from that results list. For some search queries, more than one type of result will be presented, and the user will choose a result type. For example, the image query may include a person standing near a tree, but only the results regarding the person are of interest to the user. Therefore, the image selection module 734 allows the user to indicate which image type is the "correct" type, that is, the type he or she is interested in receiving. The user may also wish to annotate the search result by adding personal comments or descriptive words, using either the structured annotation text input module 730 (for example, by filling in a form) or the free-form annotation text input module 732. In some embodiments, the optional local image analysis module 738 is a part of the client application (108, Figure 1). Furthermore, in some embodiments, the optional local image analysis module 738 includes one or more programs for performing local image analysis to pre-process or categorize the visual query or a portion thereof. For example, the client application 722 may recognize that the image contains a barcode, a face, or text before submitting the visual query to a search system. In some embodiments, when the local image analysis module 738 detects that the visual query contains a particular type of image, the module asks the user whether he or she is interested in a corresponding type of search result. For example, the local image analysis module 738 may detect a face based on its general characteristics (i.e., without determining whose face it is) and provide immediate feedback to the user before sending the query to the visual query server system. It may present a message such as "A face has been detected. Are you interested in receiving facial recognition matches for this face?" This can save time for the visual query server system (106, Figure 1). For some visual queries, the initial interface visual query processing server (110, Figure 1) sends the visual query only to the search system 112 corresponding to the image type recognized by the local image analysis module 738. In other embodiments, the initial interface server sends the visual query to all of the search systems 112A-N, but ranks results from the search system 112 corresponding to the image type recognized by the local image analysis module 738 ahead of the other results. In some embodiments, the manner in which local image analysis affects the operation of the visual query server system depends on the configuration of the client system, or on configuration or processing parameters associated with the user or the client system. Furthermore, the actual content of any particular visual query, and the results produced by the local image analysis, may cause different visual queries to be handled differently at the client system and at the visual query server system. In some embodiments, barcode recognition is performed in two steps, with the analysis of whether the visual query includes a barcode performed on the client system, in the local image analysis module 738.
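Client-side pre-processing of this kind can be pictured as a small routine that inspects the image, optionally prompts the user, and attaches a type hint to the outgoing query. The sketch below is only illustrative; the detector callables, the prompt handling, and the query payload fields are assumptions rather than the actual behavior of module 738.

```python
from typing import Callable

def preclassify_visual_query(image: bytes,
                             detect_face: Callable[[bytes], bool],
                             detect_barcode: Callable[[bytes], bool],
                             detect_text: Callable[[bytes], bool],
                             confirm_with_user: Callable[[str], bool]) -> dict:
    """Pre-process a visual query on the client before submission.

    Returns a query payload with an optional type hint that a server could use
    to route the query to a single search system or to rank its results higher.
    """
    query = {"image": image, "type_hint": None}
    if detect_face(image):
        # Generic face detection only; no attempt to determine whose face it is.
        if confirm_with_user("A face has been detected. Are you interested in "
                             "receiving facial recognition matches for this face?"):
            query["type_hint"] = "face"
    elif detect_barcode(image):
        # Two-step barcode recognition: the query is only routed to the barcode
        # search system if the client thinks a barcode is likely present.
        query["type_hint"] = "barcode"
    elif detect_text(image):
        query["type_hint"] = "text"
    return query
```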
In such two-step embodiments, the visual query is passed to the barcode search system only if the client determines that the visual query is likely to include a barcode. In other embodiments, the barcode search system processes every visual query. Optionally, the client system 102 includes additional client applications 740. Figure 6 is a block diagram illustrating an initial interface visual query processing server system 110 in accordance with an embodiment of the present invention. Typically, the initial interface server 110 includes one or more processing units (CPUs) 802, one or more network interfaces or other communications interfaces 804, memory 812, and one or more communication buses 814 for interconnecting these components. Memory 812 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. Memory 812 may optionally include one or more storage devices located remotely from the CPU(s) 802. Memory 812, or alternatively the non-volatile memory device(s) within memory 812, comprises non-transient computer-readable storage media. In some embodiments, memory 812 or the computer-readable storage media of memory 812 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 816 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 818 that is used to connect the initial interface server system 110 to other computers via one or more network communication interfaces 804 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and the like;
• a query manager 820 for handling incoming visual queries from the client system 102 and sending them to two or more parallel search systems; as described elsewhere in this document, in some special situations a visual query may be directed to just one of the search systems, such as when the visual query includes a client-generated instruction (for example, "facial recognition search only");
• a results filter module 822 for optionally filtering the results from the one or more parallel search systems and sending the top or "relevant" results to the client system 102 for presentation;
• a results ranking and formatting module 824 for optionally ranking the results from the one or more parallel search systems and for formatting the results for presentation;
• a results document creation module 826 that is used, where appropriate, to create an interactive search results document; module 826 may include sub-modules, including, but not limited to, a confinement box creation module 828 and a link creation module 830;
• a label creation module 831 for creating labels that are visual identifiers of respective sub-parts of a visual query;
• an annotation module 832 for receiving annotations from a user and sending them to an annotation database 116;
• an actionable search results module 838 for generating, in response to a visual query, one or more actionable search result elements, each configured to initiate a client-side action; examples of actionable search result elements are buttons to initiate a phone call, initiate an email, map an address, make a restaurant reservation, and provide an option to purchase a product; and
• a query and annotation database 116 comprising the database 834 itself and an index to the database 836.
The results ranking and formatting module 824 ranks the results returned from the one or more parallel search systems (112-A through 112-N, Figure 1). As noted above, for some visual queries only the results from one search system may be relevant. In such a case, only the relevant search results from that one search system are ranked. For some visual queries, several types of search results may be relevant. In these cases, in some embodiments, the results ranking and formatting module 824 ranks all of the results from the search system having the most relevant result (for example, the result with the highest relevance score) above the results from the less relevant search systems. In other embodiments, the results ranking and formatting module 824 ranks a top result from each relevant search system above the remaining results. In some embodiments, the results ranking and formatting module 824 ranks the results according to a relevance score computed for each of the search results. For some visual queries, augmented textual queries are performed in addition to the searching in the parallel visual search systems. In some embodiments, when textual queries are also performed, their results are presented in a way that is visually distinct from the visual search system results. The results ranking and formatting module 824 also formats the results. In some embodiments, the results are presented in a list format. In some embodiments, the results are presented through an interactive results document. In some embodiments, both an interactive results document and a list of results are presented. In some embodiments, the type of query dictates how the results are presented. For example, if more than one searchable subject is detected in the visual query, then an interactive results document is produced, whereas if only one searchable subject is detected, the results are displayed only in list format. The results document creation module 826 is used to create an interactive search results document. The interactive search results document may have one or more detected and searched subjects. The confinement box creation module 828 creates a confinement box around one or more of the searched subjects. The confinement boxes may be rectangular boxes, or they may delineate the shape(s) of the subject(s). The link creation module 830 creates links to the search results associated with their respective subjects in the interactive search results document. In some embodiments, clicking within the confinement box area activates the corresponding link inserted by the link creation module. The query and annotation database 116 contains information that can be used to improve visual query results. In some embodiments, the user can annotate the image after the visual query results have been presented. Furthermore, in some embodiments, the user can annotate the image before sending it to the visual query search system. Pre-annotation can aid visual query processing by focusing the results, or by running text-based searches on the annotated words in parallel with the visual query searches. In some embodiments, annotated versions of a picture can be made public (for example, when the user has given permission for publication, such as by designating the image and annotation(s) as non-private), so as to be returned as a potential image match.
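One plausible reading of the ranking behavior of the results ranking and formatting module 824 is sketched below: each parallel search system returns scored results, and the module either merges everything by relevance score or first promotes the top result of each relevant search system. The data shapes, the promotion rule, and the example scores are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# Each parallel search system returns a list of (relevance_score, result) pairs.
SystemResults = Dict[str, List[Tuple[float, str]]]

def rank_results(by_system: SystemResults, promote_top_per_system: bool = False) -> List[str]:
    """Merge results from the parallel search systems into one ranked list."""
    # Drop search systems that returned nothing relevant.
    relevant = {name: results for name, results in by_system.items() if results}
    promoted: List[Tuple[float, str]] = []
    merged: List[Tuple[float, str]] = []
    for results in relevant.values():
        ordered = sorted(results, key=lambda r: r[0], reverse=True)
        if promote_top_per_system and ordered:
            promoted.append(ordered[0])  # the top result of each relevant system
            ordered = ordered[1:]
        merged.extend(ordered)
    promoted.sort(key=lambda r: r[0], reverse=True)
    merged.sort(key=lambda r: r[0], reverse=True)
    return [result for _, result in promoted + merged]

# Example: the facial recognition match outranks the OCR result when scored higher.
ranked = rank_results({
    "facial_recognition": [(0.92, "Person A"), (0.40, "Person B")],
    "ocr": [(0.75, "recognized text")],
    "image_to_terms": [],
})
```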
As an example of such a publicly available annotated image, if a user takes a picture of a flower and annotates the image with detailed genus and species information about that flower, the user may want the image to be presented to anyone who performs a visual query search looking for that flower. In some embodiments, information from the query and annotation database 116 is periodically transferred to the parallel search systems 112, which incorporate relevant pieces of the information (if any) into their respective individual databases 114. Figure 7 is a block diagram illustrating one of the parallel search systems used to process a visual query. Figure 7 illustrates a "generic" server system 112-N in accordance with an embodiment of the present invention. This server system is generic only in that it represents any one of the visual query search servers 112-N. Typically, the generic server system 112-N includes one or more processing units (CPUs) 502, one or more network interfaces or other communications interfaces 504, memory 512, and one or more communication buses 514 for interconnecting these components. Memory 512 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. Memory 512 may optionally include one or more storage devices located remotely from the CPU(s) 502. Memory 512, or alternatively the non-volatile memory device(s) within memory 512, comprises non-transient computer-readable storage media. In some embodiments, memory 512 or the computer-readable storage media of memory 512 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 516 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 518 that is used to connect the generic server system 112-N to other computers via one or more network communication interfaces 504 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and the like;
• a search application specific to the particular server system, which may be, for example, a barcode search application, a color recognition search application, a product recognition search application, an object or object category search application, or the like;
• an optional index 522, if the particular search application uses an index;
• an optional image database 524 for storing images relevant to the particular search application, where the image data stored, if any, depends on the type of search process;
• an optional results ranking module 526 (sometimes called a relevance scoring module) for ranking the results from the search application; the ranking module may assign a relevance score to each result from the search application and, if no result reaches a predefined minimum score, may return a null or zero-value score to the initial interface visual query processing server, indicating that the results from this server system are not relevant; and
• an annotation module 528 for receiving annotation information from an annotation database (116, Figure 1), for determining whether any of the annotation information is relevant to the particular search application, and for incorporating any determined relevant portions of the annotation
information determined in the respective annotation database 530. Figure 8 is a block diagram illustrating an OCR 112-B search system used to process a visual query in accordance with an embodiment of the present invention. Typically, the OCR 112-B search system includes one or more processing units (CPUs) 602, one or more network interfaces or other communications interfaces 604, memory 612, and one or more communication buses 614 to interconnect these components. . Memory 612 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 612 may optionally include one or more storage devices remotely located with respect to the CPU(s) 602. The memory 612 or, alternatively, the non-volatile memory device(s) in memory 612 comprise a non-transient computer readable storage media. In some embodiments, memory 612 or the computer-readable storage media of memory 612 stores the following programs, modules and data structures, or a subset thereof: • an operating system 616 that includes procedures for handling various basic system services and to perform hardware-dependent tasks',• a 618 network communication module that is used to connect the 112-B OCR search engine to other computers via one or more 604 network communication interfaces (wired or wireless ) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like; • an Optical Character Recognition (OCR) module 620 that tries to recognize text in the query visual and converts the images from letters to characters; • an optional OCR database 114-B which is used by the OCR module 620 to recognize fonts, text patterns and other characteristics in particular ex letter recognition uniques; • an optional spell checking module 622 that improves the conversion of letter images to characters by checking the converted words against a dictionary and substituting potentially poorly converted letters into words that otherwise match a dictionary word; • an optional named entity recognition module 624 that searches named entities in the converted text sends the named entities recognized as terms in a term query to the term query server system (118, figure 1) and provides the results from the term lookup server system as embedded links in the OCR recognized text associated with the recognized named entities; • an optional text matching application 632 that improves the conversion of letter-to-character images by scanning converted segments (such as as converted sentences and paragraphs) in relation to a b ase of text segment data and potentially misconverted letter substitution into OCR-recognized text segments that would otherwise match a text segment from the text matching application; in some embodiments, the text segment found by the text matching application is provided as a link to the user (for example, if the user has scanned a page from the New York Times, the text matching application can provide a link to the entire article posted on the New York Times website');• a ranking and formatting of results module 626 to format the results recognized by OCR for presentation and formatting of optional links to named entities and also optionally rank all results related from the text matching application; and • 
• an optional annotation module 628 for receiving annotation information from an annotation database (116, Figure 1), for determining whether any of the annotation information is relevant to the OCR search system, and for incorporating any determined relevant portions of the annotation information into the respective annotation database 630.
Figure 9 is a block diagram illustrating a facial recognition search system 112-A used to process a visual query with at least one facial image in accordance with an embodiment of the present invention. Typically, the facial recognition search system 112-A includes one or more processing units (CPUs) 902, one or more network interfaces or other communications interfaces 904, memory 912, and one or more communication buses 914 for interconnecting these components. Memory 912 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. Memory 912 may optionally include one or more storage devices located remotely from the CPU(s) 902. Memory 912, or alternatively the non-volatile memory device(s) within memory 912, comprises non-transient computer-readable storage media. In some embodiments, memory 912 or the computer-readable storage media of memory 912 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 916 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 918 that is used to connect the facial recognition search system 112-A to other computers via one or more network communication interfaces 904 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and the like;
• a facial recognition search application 920 that includes a visual identifier module 924 for identifying potential image matches that potentially match a facial image in the query, a personal identifier module 926 for identifying persons associated with the potential image matches, a social connection measures module 928 for retrieving person-specific data comprising measures of social connectivity to the requester (and/or to another person in the image), and a ranking module 930 for generating a ranked list of identified persons according to measures of visual similarity between the facial image and the potential matches, as well as according to the social connection measures;
• a facial image database 114-A, which is searched to find the images that potentially match a facial image in a query, and which includes one or more image sources, such as social network images 932, Internet album images 934, photo sharing images 936, and previous query images 938. The image sources used in response to a particular query are identified according to data regarding the requester. In some embodiments, they include only images in accounts belonging to or associated with the requester, such as the requester's social network accounts, the requester's Internet albums, and the like. In other embodiments, the sources include images belonging to or associated with other people with whom the requester is socially connected, for example, people with a direct connection to the requester in a social graph.
Optionally, the facial image database 114-A includes images of famous people 940. In some embodiments, the facial image database includes facial images obtained from external sources, such as vendors of facial images that are legally in the public domain;
• an image feature extractor 942 that extracts features derived from the images in the facial image database 114-A and stores the information in a person-specific database 964. In some embodiments, visual features such as an indoor habitat factor, an outdoor habitat factor, a gender factor, a race factor, a glasses factor, a facial hair factor, a hair factor, a headdress factor, an eye color factor, occurrence information, and co-occurrence information are extracted with a visual feature extractor 944. In some embodiments, metadata features such as date information, time information, and location information are extracted with a metadata feature extractor 946;
• public databases 948, which are sources of person-specific data that include measures of social connectivity between the person associated with a potential image match and the requester. The data is obtained from a plurality of applications including, but not limited to, social network databases 922, social microblog databases 950, blog databases 952, email databases 954, IM databases 956, calendar databases 958, contact lists 960, and/or public URLs 962;
• a person-specific database 964 that stores information specific to particular persons. Some or all of the person-specific data is obtained from the public databases. Person-specific data is described in more detail with reference to Figures 18A-C;
• a results formatting module 966 for formatting the results for presentation; in some embodiments, the formatted results include the potential image matches and subsets of information from the person-specific database 964;
• an annotation module 968 for receiving annotation information from an annotation database (116, Figure 1), for determining whether any of the annotation information is relevant to the facial recognition search system, and for storing any relevant portions of the determined annotation information in the respective annotation database 970; and
• a person location module 972 that acquires location information concerning the current location of the requester and of one or more persons identified as potential matches to a facial image in a visual query. The acquisition of location information by the person location module 972, and the use of that location information to improve the matching of a person to a facial image by the search application 920, are discussed below in relation to Figures 16A, 17, 18A, and 18C.
Figure 10 is a block diagram illustrating an image-to-terms search system 112-C used to process a visual query in accordance with an embodiment of the present invention. In some embodiments, the image-to-terms search system recognizes objects (instance recognition) in the visual query. In other embodiments, the image-to-terms search system recognizes object categories (type recognition) in the visual query. In some embodiments, the image-to-terms search system recognizes both objects and object categories. The image-to-terms search system returns potential term matches for the images in the visual query. Typically, the image-to-terms search system 112-C includes one or more processing units (CPUs) 1002, one or more network interfaces or other communications interfaces 1004, memory 1012, and one or more communication buses 1014 for interconnecting these components.
Memory 1012 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. Memory 1012 may optionally include one or more storage devices located remotely from the CPU(s) 1002. Memory 1012, or alternatively the non-volatile memory device(s) within memory 1012, comprises non-transient computer-readable storage media. In some embodiments, memory 1012 or the computer-readable storage media of memory 1012 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 1016 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 1018 that is used to connect the image-to-terms search system 112-C to other computers via one or more network communication interfaces 1004 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and the like;
• an image-to-terms search application 1020 that searches the image search database 114-C for images matching the subject or subjects in the visual query;
• an image search database 114-C that can be searched by the search application 1020 to find images similar to the subject(s) of the visual query;
• a terms-to-image inverse index 1022 that stores the textual terms used by users when searching for images using a text-based query search system 1006;
• a results ranking and formatting module 1024 for ranking the potential image matches and/or ranking the terms associated with the potential image matches identified in the terms-to-image inverse index 1022; and
• an annotation module 1026 for receiving annotation information from an annotation database (116, Figure 1), for determining whether any of the annotation information is relevant to the image-to-terms search system 112-C, and for storing any relevant portions of the determined annotation information in the respective annotation database 1028.
Figures 5-10 are intended to be understood more as functional descriptions of the various features that may be present in a set of computer systems than as structural schematics of the embodiments described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated. For example, some items shown separately in these figures could be implemented on single servers, and single items could be implemented by one or more servers. The actual number of systems used to implement visual query processing, and how features are allocated among them, will vary from one implementation to another. Each of the methods described herein may be governed by instructions that are stored in non-transient computer-readable storage media and that are executed by one or more processors of one or more servers or clients. The above-identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various embodiments. Each of the operations shown in Figures 5-10 may correspond to instructions stored in a computer memory or non-transient computer-readable storage media.
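The terms-to-image inverse index 1022 described above can be pictured as a mapping from textual terms to the images that users have associated with those terms; candidate terms for a visual query are then the terms carried by the visually similar images. The toy sketch below illustrates that idea under those assumptions; the class and method names are invented and do not correspond to any actual implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set

class TermsToImageIndex:
    """Toy sketch of a terms-to-image inverse index (cf. index 1022)."""

    def __init__(self) -> None:
        self._term_to_images: Dict[str, Set[str]] = defaultdict(set)

    def add(self, term: str, image_id: str) -> None:
        # Record that users have associated this term with this image.
        self._term_to_images[term.lower()].add(image_id)

    def terms_for_images(self, image_ids: List[str]) -> List[str]:
        # Rank terms by how many of the visually matching images carry them.
        counts: Dict[str, int] = defaultdict(int)
        for term, images in self._term_to_images.items():
            counts[term] = len(images.intersection(image_ids))
        return [t for t, c in sorted(counts.items(), key=lambda kv: kv[1], reverse=True) if c > 0]

# Usage: images visually similar to the query vote for the terms they carry.
index = TermsToImageIndex()
index.add("golden gate bridge", "img1")
index.add("bridge", "img1")
index.add("bridge", "img2")
print(index.terms_for_images(["img1", "img2"]))  # ['bridge', 'golden gate bridge']
```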
Figure 11 illustrates a client system 102 with a screenshot of an exemplary visual query 1102. The client system 102 shown in Figure 11 is a mobile device, such as a cell phone, portable music player, or portable email device. The client system 102 includes a display 706 and one or more input devices 708, such as the buttons shown in this figure. In some embodiments, the display 706 is a touch-sensitive display 709. In embodiments having a touch-sensitive display 709, software buttons displayed on the display 709 may optionally replace some or all of the electromechanical buttons 708. Touch-sensitive displays are also useful in interacting with the results of the visual query, as explained in more detail below. The client system 102 also includes an image capture mechanism, such as a camera 710. Figure 11 illustrates a visual query 1102, which is a photograph or video frame of a package on a store shelf. In the embodiments described here, the visual query is a two-dimensional image having a resolution corresponding to the size of the visual query, in pixels, in each of two dimensions. In this example, the visual query 1102 is a two-dimensional image of three-dimensional objects. The visual query 1102 includes background elements, a product package 1104, and a variety of entity types on the package, including a picture of a person 1106, a picture of a trademark 1108, a picture of a product 1110, and a variety of textual elements 1112. As explained with reference to Figure 3, the visual query 1102 is sent to the initial interface server 110, which sends the visual query 1102 to a plurality of parallel search systems (112A-N), receives the results, and creates an interactive results document. Figures 12A and 12B each illustrate a client system 102 with a screen capture of one embodiment of an interactive results document 1200. The interactive results document 1200 includes one or more visual identifiers 1202 of respective sub-parts of the visual query 1102, each of which includes a user-selectable link to a subset of the search results. Figures 12A and 12B illustrate an interactive results document 1200 with visual identifiers that are confinement boxes 1202 (e.g., confinement boxes 1202-1, 1202-2, and 1202-3). In the embodiments shown in Figures 12A and 12B, the user activates the display of the search results corresponding to a particular subpart by tapping the activation region within the space outlined by its confinement box 1202. For example, the user would activate the search results corresponding to the image of the person by tapping the confinement box 1306 (Figure 13) that surrounds the image of the person. In other embodiments, the selectable link is selected using a mouse or keyboard rather than a touch screen. In some embodiments, the first corresponding search result is displayed when a user previews a confinement box 1202 (that is, when the user clicks, taps, or hovers a pointer over the confinement box). The user activates the display of a plurality of corresponding search results when the user selects the confinement box (that is, when the user double-clicks, taps twice, or uses another mechanism to indicate selection). In Figures 12A and 12B, the visual identifiers are confinement boxes 1202 that surround sub-parts of the visual query. Figure 12A illustrates confinement boxes 1202 that are square or rectangular. Figure 12B illustrates a confinement box 1202 that traces the outline of an identifiable entity in the subpart of the visual query, such as the confinement box 1202-3 for a beverage bottle. In some embodiments, a respective confinement box 1202 itself contains smaller confinement boxes 1202.
For example, in Figures 12A and 12B, the confinement box 1202-1 identifying the package surrounds the confinement box 1202-2 identifying the trademark and all of the other confinement boxes 1202. Some embodiments that include text also include active quick links 1204 for some of the textual terms. Figure 12B shows an example in which "Active Drink" and "United States" are displayed as quick links 1204. The search results corresponding to these terms are results received from the term query server system 118, whereas the results corresponding to the confinement boxes are results from the query-by-image search systems. Figure 13 illustrates a client system 102 with a screen capture of an interactive results document 1200 that is coded by the type of entity recognized in the visual query. The visual query of Figure 11 contains an image of a person 1106, an image of a trademark 1108, an image of a product 1110, and a variety of textual elements 1112. As such, the interactive results document 1200 shown in Figure 13 includes confinement boxes 1202 around a person 1306, a trademark 1308, a product 1310, and the two textual areas 1312. Each of the confinement boxes in Figure 13 is shown with distinct cross-hatching representing differently colored transparent confinement boxes 1202. In some embodiments, the visual identifiers of the confinement boxes (and/or the labels or other visual identifiers in the interactive results document 1200) are formatted for presentation in visually distinctive ways, such as overlay color, overlay pattern, label background color, label background pattern, label font color, and confinement box border color. The type coding for particular recognized entities is shown with respect to confinement boxes in Figure 13, but type coding can also be applied to visual identifiers that are labels. Figure 14 illustrates a client device 102 with a screen capture of an interactive results document 1200, with labels 1402 being the visual identifiers of respective sub-parts of the visual query 1102 of Figure 11. Each of the label visual identifiers 1402 includes a user-selectable link to a subset of the corresponding search results. In some embodiments, the selectable link is identified by descriptive text displayed within the area of the label 1402. Some embodiments include a plurality of links within one label 1402. For example, in Figure 14, the label hovering over the image of a woman drinking includes a link to facial recognition results for the woman and a link to image recognition results for that particular picture (e.g., images of other products or advertisements using the same picture). In Figure 14, the labels 1402 are displayed as partially transparent areas containing text, located over their respective sub-parts of the interactive results document. In other embodiments, a respective label is positioned near, but not over, its respective subpart of the interactive results document. In some embodiments, the labels are coded by type in the same manner discussed in connection with Figure 13. In some embodiments, the user activates the display of the search results corresponding to a particular subpart corresponding to a label 1302 by tapping the activation region within the space outlined by the edges or periphery of the label 1302. The same preview and selection functions discussed above in connection with the confinement boxes of Figures 12A and 12B also apply to the visual identifiers that are labels 1402.
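One way to picture the data carried by an interactive results document 1200 is as a collection of visual identifiers (confinement boxes or labels), each holding its region over the query image, the recognized entity type used for type coding, and a user-selectable link to the corresponding subset of results. The sketch below is an illustration under that assumption; the field names and link paths are invented and are not part of the disclosed system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class VisualIdentifier:
    """A confinement box or label over a subpart of the visual query."""
    kind: str                           # "confinement_box" or "label"
    region: Tuple[int, int, int, int]   # (x, y, width, height) in query pixels
    entity_type: str                    # e.g. "face", "logo", "product", "text"
    link: str                           # user-selectable link to a subset of results
    text: Optional[str] = None          # descriptive text, used for labels

@dataclass
class InteractiveResultsDocument:
    query_image: str
    identifiers: List[VisualIdentifier] = field(default_factory=list)

    def identifier_at(self, x: int, y: int) -> Optional[VisualIdentifier]:
        """Return the identifier whose activation region contains a tap or click."""
        for ident in self.identifiers:
            rx, ry, rw, rh = ident.region
            if rx <= x <= rx + rw and ry <= y <= ry + rh:
                return ident
        return None

# Example: a box around a detected face and a label over a logo.
doc = InteractiveResultsDocument(
    query_image="query-1102.jpg",
    identifiers=[
        VisualIdentifier("confinement_box", (40, 30, 120, 160), "face",
                         "/results/face/1306"),
        VisualIdentifier("label", (10, 220, 90, 30), "logo",
                         "/results/logo/1308", text="Trademark"),
    ],
)
```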
Figure 15 illustrates a screen capture of an interactive results document 1200 and the original visual query 1102 displayed concurrently with a results list 1500. In some embodiments, the interactive results document 1200 is displayed by itself, as shown in Figures 12-14. In other embodiments, the interactive results document 1200 is displayed concurrently with the original visual query, as shown in Figure 15. In some embodiments, the list of visual query results 1500 is displayed concurrently with the original visual query 1102 and/or the interactive results document 1200. The type of client system and the amount of room on the display 706 may determine whether the results list 1500 is displayed concurrently with the interactive results document 1200. In some embodiments, the client system 102 receives (in response to a visual query submitted to the visual query server system) both the results list 1500 and the interactive results document 1200, but only displays the results list 1500 when the user scrolls below the interactive results document 1200. In some of these embodiments, the client system 102 displays the results corresponding to a user-selected visual identifier 1202/1402 without having to query the server again, because the results list 1500 is received by the client system 102 in response to the visual query and is then stored locally at the client system 102. In some embodiments, the results list 1500 is organized into categories 1502. Each category contains at least one result 1503. In some embodiments, the category titles are highlighted to distinguish them from the results 1503. The categories are ordered according to their calculated category weights. In some embodiments, the category weight is a combination of the weights of the N highest-scoring results in that category. As such, the category that likely produced the more relevant results is displayed first. In embodiments in which more than one category 1502 is returned for the same recognized entity (such as the facial image recognition match and the image match shown in Figure 15), the category displayed first has the higher category weight. As explained in relation to Figure 3, in some embodiments, when a selectable link in the interactive results document 1200 is selected by a user of the client system 102, the cursor automatically moves to the appropriate category 1502 or to the first result 1503 in that category. Alternatively, when a selectable link in the interactive results document is selected by a user of the client system 102, the results list 1500 is reordered such that the category or categories relevant to the selected link are displayed first. This is accomplished, for example, either by encoding the selectable links with information identifying the corresponding search results, or by encoding the search results to indicate the corresponding selectable links or the corresponding result categories. In some embodiments, the categories of search results correspond to the query-by-image search system that produced those search results. For example, in Figure 15, some of the categories are product match 1506, logo match 1508, facial recognition match 1510, and image match 1512. The original visual query 1102 and/or an interactive results document 1200 may similarly be displayed with a category title, such as the query 1504. Similarly, results from any term search performed by the term query server may also be displayed as a separate category, such as Internet results 1514. In other embodiments, more than one entity in a visual query will produce results from the same query-by-image search system.
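Treating the category weight as a combination of the scores of the top N results in each category, as described above, the ordering of the results list 1500 can be sketched as follows. The choice of N and the use of a simple sum are illustrative assumptions; the disclosure does not specify a particular combination.

```python
from typing import Dict, List

def category_weight(scores: List[float], n: int = 3) -> float:
    """Combine the weights of the N highest-scoring results in a category."""
    return sum(sorted(scores, reverse=True)[:n])

def order_categories(results_by_category: Dict[str, List[float]], n: int = 3) -> List[str]:
    """Order categories so the one that likely produced the most relevant results comes first."""
    return sorted(results_by_category,
                  key=lambda cat: category_weight(results_by_category[cat], n),
                  reverse=True)

# Example: facial recognition produced the strongest matches, so it is shown first.
ordering = order_categories({
    "product match 1506": [0.71, 0.60],
    "logo match 1508": [0.55],
    "facial recognition match 1510": [0.95, 0.90, 0.42],
    "image match 1512": [0.80],
})
```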
As an example of a single search system producing results for multiple entities, the visual query may include two different faces that will return separate results from the facial recognition search system 112-A. As such, in some embodiments, the categories 1502 are divided by recognized entity rather than by search system. In some embodiments, an image of the recognized entity is displayed in the category header 1502 for that recognized entity, such that the results for that recognized entity are distinguishable from the results for another recognized entity, even though both sets of results are produced by the same query-by-image search system. For example, in Figure 15, the product match category 1506 includes two product entities and, as such, two entity categories 1502, a boxed product 1516 and a bottled product 1518, each of which has a plurality of corresponding search results 1503. In some embodiments, the categories may be divided by recognized entity and by type of query-by-image search system. For example, in Figure 15, there are two distinct entities that returned relevant results under the product match category. In some embodiments, the results 1503 include thumbnail images. For example, as shown for the facial recognition match results in Figure 15, small versions (also called thumbnail images) of the matching facial pictures for "Actress X" and "Social Network Friend Y" are displayed along with some textual description, such as the name of the person in the image. Figures 16A-16B are flowcharts illustrating a process of responding to a visual query that includes a facial image, in accordance with some embodiments. Each of the operations shown in these figures may correspond to instructions stored in a computer memory or non-transient computer-readable storage media. The facial recognition search system 112-A receives, from a requester, a visual query with one or more facial images in it (1602). In some embodiments, the fact that the visual query contains at least one face is determined by the initial interface visual query processing server 110. In other words, by the time a visual query is processed by the facial recognition search system 112-A, at least a portion of the visual query image has been determined to contain a potential face. In some circumstances, the visual query contains a plurality of faces, such as a picture of two or more friends or a group photo of several people. In some cases in which the visual query comprises a plurality of facial images, the requester may be interested in only one of the faces. As such, in some embodiments, when the visual query includes at least a respective facial image and a second facial image, the system receives, before identifying potential image matches, a selection of the respective facial image from the requester. For example, in some embodiments, the system identifies each potential face and requests confirmation regarding which face(s) in the query the requester wishes to have identified. Images that potentially match the respective facial image are identified (1604). These images are called potential image matches. The potential image matches are identified in accordance with visual similarity criteria. In addition, the potential image matches are identified from one or more image sources identified according to data regarding the requester (1606). In some embodiments, the data regarding the requester is obtained from the requester's profile information. In some embodiments, the requester's profile information is obtained directly from the requester. Alternatively, or in addition, the requester's profile information is received from a social network.
Potential image matches include images that are tagged, that is, images that include personal identifiers for the person or persons in the images. In some embodiments, the one or more image sources include images from the requester's social networking database(s), Internet album(s), photo sharing database(s), and other image sources associated with the requester. Furthermore, in some embodiments, a database of images of famous people (940, Figure 9) is also included in the image sources searched for potential image matches. In some embodiments, the image sources searched for potential image matches also include images from the social network database(s), Internet album(s), photo sharing database(s), and other image sources associated with the requester's friends or contacts. In embodiments that include images from the databases of a requester's friends or contacts, a determination is made as to which databases to include. For example, in some embodiments, the databases of up to a predetermined maximum number of friends or contacts are included. In other embodiments, only the databases of direct social network friends are included. Then, one or more persons associated with the potential image matches are identified (1608). In some embodiments, the one or more persons are identified from personal identifier tags associated with the identified image matches. For example, the system may identify that Bob Smith, Joe Jones, and Peter Johnson are persons associated with potential image matches for a query that includes an image of a male friend, because these three people were tagged in other images associated with the requester and these three people are visually similar to the facial image in the query. For each identified person, person-specific data that includes social connection measures obtained from a plurality of applications is retrieved (1610). The plurality of applications includes communication applications, social networking applications, calendar applications, and collaborative applications (1612). For example, the applications may include applications such as Facebook, Twitter, Buzz, G-mail (email and IM), Internet calendars, blogs such as "LiveJournal", personal public URLs, and any contact lists associated with them. In some embodiments, data is obtained only from "public" information published in these applications. In other embodiments, data is obtained if it belongs to, or has been explicitly shared with, the requester. In some embodiments, the person-specific data includes name, address, occupation, group memberships, interests, age, hometown, personal statistics, and work information for the respective identified person (as discussed in more detail with reference to Figure 18A). In some embodiments, this information is compiled from one or more of the aforementioned applications. The person-specific data includes social connection measures, which are measures of social connectivity between the respective identified person and the requester (1614). In some embodiments, the social connectivity measures include measures of social connectivity with respect to one or more of the aforementioned applications.
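A minimal sketch of gathering such social connection measures for an identified person appears below; the particular measures collected, the record shape, and the per-application callables are illustrative assumptions rather than the actual data model of the disclosed system.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SocialConnectionMeasures:
    """Social connection measures between one identified person and the requester."""
    is_social_network_friend: bool
    email_messages_exchanged: int
    im_messages_exchanged: int
    mutual_microblog_follow: bool

def gather_connection_measures(requester_id: str,
                               person_id: str,
                               social_network: Callable[[str, str], bool],
                               email_count: Callable[[str, str], int],
                               im_count: Callable[[str, str], int],
                               microblog_mutual: Callable[[str, str], bool]) -> SocialConnectionMeasures:
    # Each collaborator stands in for one application (social network, email,
    # IM, social microblog); only data the requester is allowed to see would be queried.
    return SocialConnectionMeasures(
        is_social_network_friend=social_network(requester_id, person_id),
        email_messages_exchanged=email_count(requester_id, person_id),
        im_messages_exchanged=im_count(requester_id, person_id),
        mutual_microblog_follow=microblog_mutual(requester_id, person_id),
    )
```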
These social connectivity measures may take into account, for example, one or more of: whether the respective identified person and the requester are friends on a social networking website; the amount (if any) of email and/or IM messages exchanged between the requester and the respective identified person; whether the requester and the respective identified person follow each other's social microblog posts; and the like. In some embodiments, the person-specific data for a respective identified person also includes characteristics derived from other images of the respective person (1616). In some embodiments, these characteristics include image metadata information, such as date information, time information, and location information. In other embodiments, the characteristics derived from other images of the respective person comprise visual factors, such as an indoor habitat factor, an outdoor habitat factor, a sex factor, a race factor, a glasses factor, a facial hair factor, a hair factor, a headdress factor, and an eye color factor. In still other embodiments, the characteristics derived from other images of the respective person include occurrence information regarding the number of occurrences of the respective person in images from the one or more image sources, and/or co-occurrence information regarding the number of co-occurrences of the respective person with a second person in images from the one or more image sources. Optionally, in some embodiments, current location information for the requester and current location information for a respective identified person are obtained (1618) by the person location module 972 (Figure 9). For example, the current locations of both the requester and the respective identified person may be obtained from a GPS receiver located in a mobile device, from the IP address of a desktop device used by the person, from the person's home or work address, or from a location published by the person (such as "I am currently at a conference in Boston"). Then, an ordered list of people is generated by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the potential image matches, and also according to ranking information that comprises at least the social connection measures (1620). These and other factors affecting the ranking are discussed in more detail below in relation to Figure 17. The process continues as shown in Figure 16B. Optionally, an opt-in list is checked and a determination is made as to whether one or more person identifiers are releasable to the requester (1622). In some embodiments, this check is performed when the potentially matching image(s) come from a source other than the requester's own account(s), or when the requester's own accounts do not contain tagged images of the respective identified person. At least one person identifier from the ordered list is then sent to the requester (1624), thereby identifying one or more persons. In some embodiments, the person identifier is a name. In other embodiments, the person identifier is a handle, an email address, a nickname, or the like. In some embodiments, a representative picture, such as a profile picture or the image of the identified person that best matches the visual query, is sent along with the person identifier. In such embodiments, when more than one person is identified as a potential match, a representative picture of each identified person is sent along with the response to the image query.
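Putting steps 1602 through 1624 together, the facial recognition response flow can be sketched as the small pipeline below. The score combination (a weighted sum of visual similarity and an aggregate social connection score) and all of the function names are assumptions made for illustration; the disclosure describes the factors involved but does not prescribe a specific formula.

```python
from typing import Callable, List, Tuple

def respond_to_facial_query(face_image: bytes,
                            requester_id: str,
                            find_matches: Callable[[bytes, str], List[str]],
                            people_for_images: Callable[[List[str]], List[str]],
                            visual_similarity: Callable[[bytes, str], float],
                            social_score: Callable[[str, str], float],
                            releasable: Callable[[str, str], bool],
                            social_weight: float = 0.5,
                            top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k releasable person identifiers, best-ranked first."""
    # 1604/1606: potential image matches from sources identified for this requester.
    candidate_images = find_matches(face_image, requester_id)
    # 1608: people associated with the potential image matches.
    people = people_for_images(candidate_images)
    # 1610-1620: rank by visual similarity and social connection measures.
    scored = []
    for person in people:
        score = ((1.0 - social_weight) * visual_similarity(face_image, person)
                 + social_weight * social_score(requester_id, person))
        scored.append((person, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # 1622/1624: release only identifiers the requester is allowed to see.
    return [(p, s) for p, s in scored if releasable(p, requester_id)][:top_k]
```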
In some embodiments, additional information, such as contact information or a snippet of a recent public post, is also sent with the person identifier. In other embodiments, in addition to the person identifier, the connection found between the requester and the person in the image is also returned. For example, a ranked result for Joe Smith might include the statement "Joe Smith is listed as a contact in more than one of your accounts" or "Both you and Joe Smith are members of the Palo Alto Tennis Club" or "Both you and Joe Smith are friends with Karen Jones". Additional information, such as the person's contact information, group affiliations, and the names of the people in the social graph between the requester and the person in the matching image, may be included in the results returned to the requester. In some embodiments, the augmented information presented to the requester is explicitly or implicitly specified by the requester (for example, by configuration values in his or her profile, by parameters in the visual query, or by the type of the visual query). In some embodiments, when more than one person identifier is sent to the requester, more information is provided for the higher-ranked identified persons than for the lower-ranked identified persons. In some embodiments, a copy of the visual query (or of the portion of the query containing the respective facial image) is also sent with the one or more person identifiers (1626). When more than one facial image was present in the original visual query and one or more of the facial images are positively identified, in some embodiments a copy of the visual query is also sent to one or more of the people identified in the visual query. Thus, if a group photograph is taken and multiple people want copies of it, the requester does not need to find their contact information and manually send each of them a copy of the photograph. In some embodiments, the requester must first verify that copies should be sent to one or more of the identified people before they are sent. In some embodiments, a selection of a person identifier is received from the requester (1628). Then, in response to the selection, data corresponding to the selected person identifier is sent to the requester (1630). In some embodiments, this data includes one or more images associated with the person identifier, contact information associated with the person identifier, public profile information associated with the person identifier, and so on. In some embodiments, the requester is given the option to store some or all of this information in the requester's contact list, or to update the requester's contact information for the identified person. In some embodiments, the information is associated with the requester's visual query, or the portion of the query containing the facial image corresponding to the person identifier is stored with the contact list information. Furthermore, in some embodiments, the facial image of the visual query is stored as an additional image of the respective person corresponding to the selected person identifier (1632). In some embodiments, the image is stored in a previous queries portion of the image sources (938, Figure 9). In some embodiments, the requester is given an opportunity to annotate the image to include additional data. If annotation data is entered by the requester, it is received and stored (1634) by the facial recognition search system 112-A. The annotation module (968, Figure 9) accepts annotations to improve future facial recognition searches.
For example, if the user annotates a picture of a person with that person's name, that picture may be used in future facial recognition queries to recognize the person. In some embodiments, for privacy reasons, the additional annotated pictures of a person may be used by the facial recognition search system 112-A to augment the facial recognition process, but are not returned as an image result to anyone other than the original requester. In some embodiments, only the actual person identified in the visual query is allowed to make an image public (or available to people other than the requester). In some embodiments, once the person is positively identified, a request is sent to that person asking whether he or she will allow the image to be returned as a result for future queries by people within his or her social network. In some embodiments, more than one image of the same person may be retrieved at step 1604. Once the potentially matching images are retrieved and it is determined that the images are of the same person, the images are associated with the same data and treated as a single unit for the remaining processing steps. This determination may be made by noting that the images have the same personal identifier, the same or similar person-specific data (name, address, and the like), or the same or similar social connections. Optionally, if two or more images are retrieved with the same person identifier, then at step 1624 more than one of the images retrieved for that person identifier is returned in response to the image query. Figure 17 is a flowchart illustrating factors and characteristics used in generating an ordered list of people who potentially match a facial image in a visual query. This flowchart provides more detail regarding step 1620, discussed above. In some embodiments, various factors are used in determining a ranking score for a respective person in the ordered list of people according to the social network connection measures (1702). In some embodiments, an amount of communication between a respective person and the requester in the one or more communication applications is determined, and a ranking score for the respective person is then determined, where a factor in determining the ranking score for the respective person is the determined amount of communication between the respective person and the requester in the one or more communication applications (1704). The communication applications may include social networking applications, social microblogs, email applications, and/or instant messaging applications. For example, if a respective person has communicated extensively with the requester through one or more communication applications (e.g., extensive email communications and social network posts), then the requester likely knows the respective person quite well, and thus the facial image in the visual query is more likely to be that of the respective person. In some embodiments, this factor is used only when the amount of communication is above a predetermined threshold (for example, a set number of communications, a number of communications within a certain period of time, or a percentage of total communications).
In some embodiments, the facial recognition search system 112-A determines whether the amount of communication between the respective person and the requester in the one or more communication applications exceeds a threshold, and a factor in determining the ranking score for the respective person is whether that threshold is exceeded. In some embodiments, a determination is made as to whether the requester and a respective person are directly connected in a respective social networking application, and a ranking score for the respective person is then determined, where a factor in determining the ranking score for the respective person is whether the requester and the respective person are directly connected in the respective social networking application (1706). For example, if the requester and the respective person are directly connected as friends, then the requester likely knows the respective person quite well, and thus the facial image in the visual query is more likely to be that of the respective person. In cases where the person-specific data for the respective person includes a plurality of characteristics, such as two or more of: name, address, occupation, group memberships, interests, age, hometown, personal statistics, and/or work information for the respective person, the same information is also retrieved for the requester, to the extent that it is available to the facial recognition search system 112-A. Then, one or more personal similarity measures are determined according to the extent to which the requester's person-specific data is similar to the person-specific data of the respective identified person. A ranking score for the respective identified person is determined, where one or more factors in determining the ranking score for the respective identified person are the one or more personal similarity measures (1708). For example, if the requester and the respective person are of similar ages, have similar occupations, and are members of similar groups, they are more likely to be friends, and thus the facial image in the visual query is more likely to be that of the respective person. In circumstances where current location information for both the requester and the identified person is successfully obtained, a ranking score for the respective identified person is determined, where a factor in determining the ranking score for the respective identified person is whether the current location information for the requester matches the current location information for the respective identified person (1710). For example, when the requester and the respective person are both determined to be in the same location, that proximity increases the probability that the facial image in the visual query is that of the respective person. Conversely, when the requester and the respective person are determined not to be in the same location, the lack of proximity greatly decreases the probability that the facial image in the visual query is that of the respective person. Furthermore, in some embodiments, a location history or log for both the requester and the identified person is retrieved, and the two are compared with each other for a match. In some embodiments, the location logs of the requester and of the identified person are further compared against a location (and/or date and time) characteristic derived from the query image itself.
For example, if the query location information indicates that the image was taken on July 2nd in Santa Cruz, CA, and the location logs for both the requester and the identified person also indicate that they were in Santa Cruz, CA on July 2nd, then this location match increases the likelihood that the facial image in the visual query is that of the respective person. In embodiments where the person-specific data for a respective person also comprises characteristics derived from other images of the respective person (discussed in relation to step 1616), the ranking is additionally according to the similarity between the received query and the characteristics derived from other images of the respective person (1712). Various factors used in determining the ranking score for a respective person are in accordance with these characteristics derived from other images of the respective person (1714).

In some embodiments, the characteristics derived from other images of the respective person include date information (for example, day of week, day of month and/or full date) and time information for the image capture. Then, one or more similarity measures are determined according to a degree to which the received query has image capture date and time information similar to the date and time information of one or more other images of the respective person. A ranking score for the respective person is determined, where one or more factors in determining the ranking score for the respective person are the one or more similarity measures (1716). In some embodiments, the similarity measure is a Boolean value (e.g., yes/no or 1/0). In other embodiments, a similarity measure is a vector of Boolean values (e.g., same date yes/no, within 1 hour yes/no, within 5 hours yes/no, etc.). In still other embodiments, the similarity measure is a numeric value (for example, between 0 and 1) that measures the similarity. In some embodiments, the similarity measure is determined separately for each other image of the respective person, while in other embodiments a group value over all images of the respective person is determined. In some embodiments, another characteristic derived from the images is place/location information, which can be used as an additional or alternative similarity measure, as discussed above. For example, if the visual query has date, time and/or location information similar to that of one or more other images, this similarity increases the likelihood that the facial image in the visual query is the respective person who appeared in the one or more other images with similar date, time and/or location information.

In some embodiments, the characteristics derived from other images of the respective person include occurrence information regarding a number of occurrences of the respective person in the images from the one or more image sources. In some of these embodiments, a factor in determining the ranking score for the respective person is the occurrence information for the respective person (1718). For example, if numerous other images include the respective person, then the requester likely knows the respective person very well, which increases the likelihood that the facial image in the visual query is that of the respective person. In some embodiments, the characteristics derived from other images of the respective person include visual factors, including one or more of: an indoor habitat factor, an outdoor habitat factor, a sex factor, a race factor, a glasses factor, a facial hair factor, a hair factor, a headdress factor, a clothing factor, and an eye color factor.
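The date/time similarity measure described above (1716) can be pictured, under assumed time windows and an assumed decay function that are not part of the disclosure, as returning both a Boolean vector and a numeric score:

from datetime import datetime

def time_similarity(query_time, other_time):
    """Return (boolean_vector, numeric_score) comparing capture timestamps.

    boolean_vector: (same date, within 1 hour, within 5 hours)
    numeric_score:  value in [0, 1] that decays as the time gap grows
    """
    delta_hours = abs((query_time - other_time).total_seconds()) / 3600.0
    vector = (
        query_time.date() == other_time.date(),
        delta_hours <= 1,
        delta_hours <= 5,
    )
    numeric = 1.0 / (1.0 + delta_hours / 24.0)  # 1.0 for identical times, ~0.5 after a day
    return vector, numeric

query = datetime(2010, 7, 2, 14, 30)
other = datetime(2010, 7, 2, 16, 0)
print(time_similarity(query, other))  # ((True, False, True), ~0.94)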
In some of these embodiments, one or more factors in determining the ranking score for the respective person include the visual factors listed above for the respective person (1720). In some situations, the visual query includes a plurality of facial images. When more than one facial image is in the visual query, interconnections between them can be useful in correctly identifying them. For example, if the people have strong measures of social connection or appear together in other images, those facts increase the likelihood that they are also together in the query image. In some embodiments, the visual query includes at least the respective facial image and a second facial image. Images (here called potential second image matches) that potentially match the second facial image according to visual similarity criteria are identified. The potential second image matches are images from one or more image sources identified according to data regarding the requester. Then, a second person associated with the potential second image matches is identified. For purposes of this determination, the second person is assumed to be identified with a high degree of certainty. For each person identified as a potential match to the respective facial image, person-specific data that includes second measures of social connection, that is, measures of social connectivity to the second person, is obtained from the plurality of applications. Then, an ordered list of people is generated by ranking the one or more identified people additionally according to ranking information that includes at least the second measures of social connection. As such, the ranking of a respective person is additionally according to second measures of social connection, which comprise measures of social connection to a second person in the query (1722). In other words, in some embodiments, both social connections to the requester and social connections to the second person are used in generating the ordered list of people. In other embodiments, one or more of the other factors discussed above are compared between the second person and each person identified as a potential match to find a best match. For example, if the second person and a respective person are employed by the same company, appear in other images that have similar date/time information, or communicate extensively with each other, then these factors can be used to correctly identify them. In another example, the characteristics derived from other images of the respective person include information regarding a number of co-occurrences of the respective person and the second person in images from the one or more image sources; and when a ranking score for the respective person is determined, one factor in determining the ranking score for the respective person is the number of co-occurrences of the respective person and the second person in images from the one or more image sources (1724).

Figure 18A is a block diagram illustrating a portion of the data structure of a facial image database 114-A used by the facial recognition search system 112-A. In some embodiments, the facial image database contains one or more images of a person 1802 obtained from one or more image sources identified in accordance with data relating to the requester. In some embodiments, facial image database 114-A also contains a unique ID 1804, or person identifier, for the person. Additional information regarding the person is associated with the person identifier 1804 and is stored in a person-specific database 964.
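The data layout sketched in Figure 18A can be pictured roughly as the following record, with field names mirroring the reference numerals in the text; the concrete Python types and field defaults are assumptions made purely for illustration:

from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class PersonRecord:
    """Rough sketch of one entry in facial image database 114-A together with
    the person-specific data 964 keyed by the same unique person identifier."""
    unique_id: str                                                # person identifier 1804
    images: List[bytes] = field(default_factory=list)             # images of the person 1802
    name: Optional[str] = None                                    # 1806
    address: Optional[str] = None                                 # 1808
    occupation: Optional[str] = None                              # 1810
    group_memberships: List[str] = field(default_factory=list)    # 1812
    social_connections: List[str] = field(default_factory=list)   # 1814, ids of connected people
    current_location: Optional[Tuple[float, float]] = None        # 1816
    sharing_preferences: dict = field(default_factory=dict)       # 1818
    interests: List[str] = field(default_factory=list)            # 1820
    age: Optional[int] = None                                     # 1822
    hometown: Optional[str] = None                                 # 1824
    derived_image_features: dict = field(default_factory=dict)    # 1830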
Then, some or all of this additional information is used in determining potential matches for a facial image in a visual query. For example, an ordered list of identified people associated with potential image matches is generated by ranking the people according to measures of social connectivity to the requester, such as matching group memberships 1812 or strong social connections 1814. The person-specific data 964 is used, in addition to the potential image match being visually similar to the facial image in the visual query, when determining the ordered list of identified people. The person-specific database 964 may include, but is not limited to, any of the following items for the person identified by unique ID 1804: name 1806, address 1808, occupation 1810, group memberships 1812, social network connections 1814 (explained in more detail with reference to Figure 18B), current location 1816, sharing preferences 1818, interests 1820, age 1822, hometown 1824, personal statistics 1826, and work information 1828. This information is obtained from a plurality of applications such as communication applications, social networking applications, calendar applications and collaborative applications. In some embodiments, the person-specific data also includes characteristics derived from one or more images of the person 1830, as discussed in relation to Figure 18C.

Figure 18B illustrates an example of social network connections 1814. In some embodiments, the person-specific data for an identified person includes measures of social connectivity to the requester (identified as the requester in Figure 18B) that are obtained from a plurality of applications. The lines between the people in this figure represent one or more of their social connections (such as connections by email, instant messaging, and social networking websites). In some embodiments, the social distance between two people is used as a factor in determining a ranking score for potential image matches. For example, if one potential match image is an image of Person C and another potential match image is an image of Person Y, in some embodiments the potential match image of Person C will receive a higher social connectivity ranking factor (to be used in computing a ranking score) than that of Person Y, because, ignoring all other factors, the requester is more likely to have taken a picture of someone directly connected to the requester (Person C) than of someone three social network "hops" away (Person Y). Similarly, Person W will receive a higher social connectivity ranking factor than Person A, as Person W is two social network "hops" away from the requester, while Person A is three social network "hops" away from the requester.

In some embodiments, the social network connections for a requester are also used to determine which image sources to search in response to the requester's visual query. For example, in some embodiments, images in accounts that belong to people with a direct social network connection to the requester are included in the image sources searched for images that match a facial image in the visual query, while images in accounts that belong to people who do not have a direct social network connection to the requester are not included in the image sources searched. For some visual queries, other information from the person-specific database 964 of Figure 18A is used in conjunction with the distance, or number of "hops", in the social network connections graph of Figure 18B.
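A small sketch of how the "hops" distance of Figure 18B might be computed over a connection graph and turned into a ranking factor, and of how direct connections could select which image sources to search, is given below; the graph representation, the 1/hops decay, and the helper names are assumptions for illustration only:

from collections import deque

def hop_distance(graph, requester, person):
    """Breadth-first search over the social connection graph; returns the
    number of hops from the requester to the person, or None if unreachable."""
    seen, queue = {requester}, deque([(requester, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == person:
            return dist
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def social_distance_factor(graph, requester, person):
    """Closer people get a larger factor: 1.0 at one hop, 0.5 at two, and so on."""
    hops = hop_distance(graph, requester, person)
    return 0.0 if hops is None or hops == 0 else 1.0 / hops

def searchable_image_sources(graph, requester, accounts_by_person):
    """Only accounts of people directly connected to the requester are searched."""
    direct = graph.get(requester, ())
    return [acct for person in direct for acct in accounts_by_person.get(person, ())]

graph = {"requester": ["C", "B"], "B": ["W"], "W": ["Y"], "C": []}
print(social_distance_factor(graph, "requester", "C"))  # 1.0 (one hop)
print(social_distance_factor(graph, "requester", "Y"))  # ~0.33 (three hops)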
For example, if the requester and the respective person live close to each other, work in the same industry, are in the same social network "groups", and both have mobile devices that are currently in the same location (as measured, for example, by GPS receivers on their mobile devices), the respective person's ranking score may still be high even though that respective person is several "hops" away from the requester in the social network connections graph. In another example, if the respective person in a potential match image is only one "hop" away from the requester in the social network connections graph, that respective person may rank high even despite a weak connection determined through the person-specific database 964 (such as both people being members of a large group, such as sharing a religion or political party).

In some embodiments, the requester may identify certain information from the person-specific database 964 as more important than other information from the person-specific database 964. For example, the requester may specify that information related to the industry in which a person works be given greater weight than other person-specific data, because the requester is performing a job-related role and thus query images are likely to include facial images of other people working in the same industry as the requester. In another example, the requester may specify that age-related information be given greater weight than other person-specific data, because the requester is submitting query images from a party (or other function) attended by people who are all, or primarily, of a similar age.

Figure 18C is a block diagram illustrating some characteristics derived from images 1830, which are derived from images of each person associated with the requester. In some embodiments, these derived characteristics (derived from at least one image of the person) are stored, keyed by the person identifier, in a database. These derived characteristics include one or more of (and typically two or more of): an indoor habitat factor 1832, an outdoor habitat factor 1834, a sex factor 1836, a race factor 1838, a glasses factor 1840, a facial hair factor 1842, a hair factor 1844, a headdress factor 1846, a clothing factor 1847, and an eye color factor 1848, as well as occurrence information 1850 regarding a number of occurrences of the respective person in images from the one or more image sources, and co-occurrence information 1852 regarding a number of co-occurrences of the respective person with various other people in images from the one or more image sources. In some embodiments, the derived characteristics also include image metadata, such as date information 1854, time information 1856, and location information 1858 for each image. Each derived characteristic 1830, derived from other images of a respective person, is given a value and a weight, which are used in determining the ranking score for a respective person when that derived characteristic is used.

The foregoing description, for purposes of explanation, has been given with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings.
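Putting the pieces together, the overall ranking score can be viewed as a weighted combination of individual factors, with requester-specified weights emphasizing particular kinds of person-specific data; the factor names, values, and weights below are purely illustrative assumptions, not values from the disclosure:

def ranking_score(factors, weights):
    """Combine individual ranking factors into a single score.

    factors: dict mapping factor name -> value (e.g. visual similarity,
             communication volume, social distance, location match)
    weights: dict mapping factor name -> weight; the requester may raise the
             weight of factors such as industry or age to suit the query.
    """
    return sum(weights.get(name, 1.0) * value for name, value in factors.items())

candidate_factors = {
    "visual_similarity": 0.82,
    "communication_volume": 0.46,
    "social_distance": 1.0,
    "location_match": 1.1,
    "same_industry": 1.0,
}
weights = {"same_industry": 2.0}   # requester emphasizes work-related data
print(round(ranking_score(candidate_factors, weights), 2))  # 5.38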
The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, and thereby to enable others skilled in the art to best utilize the invention and the various embodiments, with various modifications as are suited to the particular use contemplated.
Claims (14)

[0001] 1. Computer-implemented method of processing a visual query that includes a facial image, performed on a server system with one or more processors and memory that stores one or more instructions for execution by the one or more processors to perform the method, the method comprising: receiving, from a requester, a visual query comprising one or more facial images that include a respective facial image of the one or more facial images; identifying image matches that match the respective facial image according to visual similarity criteria, the image matches comprising images from one or more image sources identified according to data relating to the requester; the method being characterized in that it further comprises: identifying one or more people associated with the image matches; obtaining first information representing a current location of the requester; obtaining second information representing a last known transient location of a person associated with a potential image match among the potential image matches; and identifying the person based at least in part on (i) the current location of the requester and (ii) the last known transient location of the person; retrieving, for each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications; obtaining current location information for a respective identified person; determining a ranking score for the respective identified person, wherein one factor in determining the ranking score for the respective identified person is whether the current location information for the requester matches the current location information for the respective identified person; generating an ordered list of people by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the image matches and according to ranking information that comprises at least the measures of social connection and the determined ranking score; and sending the requester at least one person identifier from the ordered list.

[0002] 2. Computer-implemented method according to claim 1, characterized in that the plurality of applications includes one or more communication applications selected from the group consisting of: social network applications, social microblogs, email applications and instant messaging applications; wherein the method further comprises: determining an amount of communication between a respective identified person and the requester in the one or more communication applications; and determining a ranking score for the respective identified person, wherein one factor in determining the ranking score for the respective identified person is the determined amount of communication between the respective identified person and the requester in the one or more communication applications.

[0003] 3.
Computer-implemented method according to claim 1, characterized in that it further comprises: determining whether the requester and a respective identified person are directly connected in a respective social network application, and determining a ranking score for the respective identified person, wherein a factor in determining the ranking score for the respective identified person is the determination of whether the requester and the respective identified person are directly connected in a respective social network application.

[0004] 4. Computer-implemented method according to any one of claims 1 to 3, characterized in that the person-specific data for a respective identified person further comprises characteristics derived from other images of the respective identified person; and wherein the ranking is additionally according to the similarity between the received query and the characteristics derived from other images of the respective identified person.

[0005] 5. Computer-implemented method according to claim 4, characterized in that the characteristics derived from other images of the respective identified person include date and time information for the image capture; and wherein the method further comprises: determining one or more similarity measures according to a degree to which the received query has image capture date and time information similar to the date and time information of the other images of the respective identified person; and determining a ranking score for the respective identified person, wherein one or more factors in determining the ranking score for the respective person are the one or more similarity measures.

[0006] 6. Computer-implemented method according to claim 4 or 5, characterized in that the characteristics derived from other images of the respective identified person include occurrence information regarding a number of occurrences of the respective identified person in the images from the one or more image sources; and wherein a factor in determining the ranking score for the respective identified person is the occurrence information for the respective identified person.

[0007] 7. Computer-implemented method according to any one of claims 1 to 6, characterized in that the visual query comprises a plurality of facial images that include at least the respective facial image and a second facial image; wherein the method further comprises: identifying second image matches that match the second facial image according to visual similarity criteria, the second image matches comprising images from the one or more image sources identified according to data relating to the requester; identifying a second person associated with the second image matches; retrieving, for each identified person, person-specific data comprising second measures of social connectivity to the second person obtained from the plurality of applications; and generating the ordered list of people by ranking the one or more identified people additionally according to ranking information comprising at least the second measures of social connection.

[0008] 8.
Computer-implemented method according to claim 7, characterized in that the person-specific data for a respective identified person additionally comprises characteristics derived from other images of the respective identified person, the characteristics derived from other images of the respective identified person including information regarding a number of co-occurrences of the respective identified person and the second person in images from the one or more image sources; and wherein the method further comprises: determining a ranking score for the respective identified person, wherein a factor in determining the ranking score for the respective identified person is the number of co-occurrences of the respective identified person and the second person in images from the one or more image sources.

[0009] 9. Computer-implemented method according to any one of claims 1 to 8, characterized in that it further comprises: before sending the requester the at least one person identifier, determining that the person identifier is releasable to the requester by checking a membership list.

[0010] 10. Computer-implemented method according to any one of claims 1, 4, 6, 7, or 9, characterized in that, for a respective identified person, the person-specific data additionally comprises one or more of: name, address, occupation, group memberships, interests, age, hometown, personal statistics and work information for the respective identified person; and wherein the method further comprises: retrieving person-specific data for the requester that also includes one or more of: name, address, occupation, group memberships, interests, age, hometown, personal statistics and work information for the requester; determining one or more measures of personal similarity according to the degree to which the person-specific data of the requester is similar to the person-specific data of the respective identified person; and determining a ranking score for the respective identified person, wherein one or more factors in determining the ranking score for the respective identified person are the one or more measures of personal similarity.

[0011] 11.
Server system for processing a visual query that includes a facial image, comprising: one or more processors for executing instructions; and memory that stores one or more instructions to be executed by the one or more processors; the one or more instructions comprising instructions to: receive, from a requester, a visual query comprising one or more facial images that include a respective facial image of the one or more facial images; identify image matches that match the respective facial image according to visual similarity criteria, the image matches comprising images from one or more image sources identified according to data relating to the requester; the system being characterized in that the instructions further comprise instructions to: identify one or more people associated with the potential image matches; obtain first information representing a current location of the requester; obtain second information representing a last known transient location of a person associated with a potential image match among the potential image matches; and identify the person based at least in part on (i) the current location of the requester and (ii) the last known transient location of the person; retrieve, for each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications; obtain current location information for a respective identified person; determine a ranking score for the respective identified person, wherein one factor in determining the ranking score for the respective identified person is whether the current location information for the requester matches the current location information for the respective identified person; generate an ordered list of people by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the image matches and according to ranking information that comprises at least the measures of social connection and the determined ranking score; and send the requester at least one person identifier from the ordered list.

[0012] 12. Server system according to claim 11, characterized in that it further comprises instructions for implementing the method as defined in any one of claims 2 to 10.

[0013] 13.
Computer-readable, non-transient storage media storing one or more instructions configured for execution by a computer, wherein the one or more instructions comprise instructions to: receive, from a requester, a visual query comprising one or more facial images that include a respective facial image of the one or more facial images; identify image matches that match the respective facial image according to visual similarity criteria, the image matches comprising images from the one or more image sources identified according to data relating to the requester; characterized in that the instructions further comprise instructions to: identify one or more people associated with the potential image matches; obtain first information representing a current location of the requester; obtain second information representing a last known transient location of a person associated with a potential image match among the potential image matches; and identify the person based at least in part on (i) the current location of the requester and (ii) the last known transient location of the person; retrieve, for each identified person, person-specific data comprising measures of social connectivity to the requester obtained from a plurality of applications selected from the group consisting of communication applications, social network applications, calendar applications and collaborative applications; obtain current location information for a respective identified person; determine a ranking score for the respective identified person, wherein one factor in determining the ranking score for the respective identified person is whether the current location information for the requester matches the current location information for the respective identified person; generate an ordered list of people by ranking the one or more identified people according to one or more measures of visual similarity between the respective facial image and the image matches and according to ranking information that comprises at least the measures of social connection and the determined ranking score; and send the requester at least one person identifier from the ordered list.

[0014] 14. Computer-readable, non-transient storage media according to claim 13, characterized in that it further comprises instructions for implementing the method as defined in any one of claims 2 to 10.
search queries| US11003667B1|2016-05-27|2021-05-11|Google Llc|Contextual information for a displayed resource| DK201670609A1|2016-06-12|2018-01-02|Apple Inc|User interfaces for retrieving contextually relevant media content| US10318812B2|2016-06-21|2019-06-11|International Business Machines Corporation|Automatic digital image correlation and distribution| US10152521B2|2016-06-22|2018-12-11|Google Llc|Resource recommendations for a displayed resource| CN106096011A|2016-06-23|2016-11-09|北京小米移动软件有限公司|Method for picture sharing and device| CN106096009A|2016-06-23|2016-11-09|北京小米移动软件有限公司|Method for generating message and device| US10802671B2|2016-07-11|2020-10-13|Google Llc|Contextual information for a displayed resource that includes an image| US10051108B2|2016-07-21|2018-08-14|Google Llc|Contextual information for a notification| US10467300B1|2016-07-21|2019-11-05|Google Llc|Topical resource recommendations for a displayed resource| US10489459B1|2016-07-21|2019-11-26|Google Llc|Query recommendations for a displayed resource| US10169649B2|2016-07-28|2019-01-01|International Business Machines Corporation|Smart image filtering method with domain rules application| US10212113B2|2016-09-19|2019-02-19|Google Llc|Uniform resource identifier and image sharing for contextual information display| CN106446831B|2016-09-24|2021-06-25|江西欧迈斯微电子有限公司|Face recognition method and device| US20180247310A1|2017-02-28|2018-08-30|Mastercard International Incorporated|System and method for validating a cashless transaction| US10282598B2|2017-03-07|2019-05-07|Bank Of America Corporation|Performing image analysis for dynamic personnel identification based on a combination of biometric features| US10311308B2|2017-03-31|2019-06-04|International Business Machines Corporation|Image processing to identify selected individuals in a field of view| EP3410330B1|2017-05-31|2021-07-21|Mastercard International Incorporated|Improvements in biometric authentication| US10679068B2|2017-06-13|2020-06-09|Google Llc|Media contextual information from buffered media data| US20180365268A1|2017-06-15|2018-12-20|WindowLykr Inc.|Data structure, system and method for interactive media| WO2019008553A1|2017-07-07|2019-01-10|Bhushan Fani|System and method for establishing a communication session| US10607082B2|2017-09-09|2020-03-31|Google Llc|Systems, methods, and apparatus for image-responsive automated assistants| US10824329B2|2017-09-25|2020-11-03|Motorola Solutions, Inc.|Methods and systems for displaying query status information on a graphical user interface| US10776467B2|2017-09-27|2020-09-15|International Business Machines Corporation|Establishing personal identity using real time contextual data| US10803297B2|2017-09-27|2020-10-13|International Business Machines Corporation|Determining quality of images for user identification| US10839003B2|2017-09-27|2020-11-17|International Business Machines Corporation|Passively managed loyalty program using customer images and behaviors| US10795979B2|2017-09-27|2020-10-06|International Business Machines Corporation|Establishing personal identity and user behavior based on identity patterns| GB2581657A|2017-10-10|2020-08-26|Laurie Cal Llc|Online identity verification platform and process| WO2019083508A1|2017-10-24|2019-05-02|Hewlett-Packard Development Company, L.P.|Facial recognitions based on contextual information| US11176679B2|2017-10-24|2021-11-16|Hewlett-Packard Development Company, L.P.|Person segmentations for background replacements| KR102061787B1|2017-11-29|2020-01-03|삼성전자주식회사|The Electronic 
Device Shooting Image and the Method for Displaying the Image| US10565432B2|2017-11-29|2020-02-18|International Business Machines Corporation|Establishing personal identity based on multiple sub-optimal images| US10387487B1|2018-01-25|2019-08-20|Ikorongo Technology, LLC|Determining images of interest based on a geographical location| CN108270794B|2018-02-06|2020-10-09|腾讯科技(深圳)有限公司|Content distribution method, device and readable medium| US10642886B2|2018-02-14|2020-05-05|Commvault Systems, Inc.|Targeted search of backup data using facial recognition| US10810457B2|2018-05-09|2020-10-20|Fuji Xerox Co., Ltd.|System for searching documents and people based on detecting documents and people around a table| US10511763B1|2018-06-19|2019-12-17|Microsoft Technology Licensing, Llc|Starting electronic communication based on captured image| US10402553B1|2018-07-31|2019-09-03|Capital One Services, Llc|System and method for using images to authenticate a user| KR102077093B1|2018-08-23|2020-02-13|엔에이치엔 주식회사|Device and method to share image received from user terminal with other user terminals| US10936856B2|2018-08-31|2021-03-02|15 Seconds of Fame, Inc.|Methods and apparatus for reducing false positives in facial recognition| US10891480B2|2018-09-27|2021-01-12|Ncr Corporation|Image zone processing| CN109388722A|2018-09-30|2019-02-26|上海碳蓝网络科技有限公司|It is a kind of for adding or searching the method and apparatus of social connections people| US10936178B2|2019-01-07|2021-03-02|MemoryWeb, LLC|Systems and methods for analyzing and organizing digital photos and videos| US11010596B2|2019-03-07|2021-05-18|15 Seconds of Fame, Inc.|Apparatus and methods for facial recognition systems to identify proximity-based connections| US11093715B2|2019-03-29|2021-08-17|Samsung Electronics Co., Ltd.|Method and system for learning and enabling commands via user demonstration| DK201970535A1|2019-05-06|2020-12-21|Apple Inc|Media browsing user interface with intelligently selected representative media items| US11108996B1|2020-07-28|2021-08-31|Bank Of America Corporation|Two-way intercept using coordinate tracking and video classification|
Legal status:
2018-01-09 | B25D | Requested change of name of applicant approved | Owner name: GOOGLE LLC (US)
2019-01-15 | B06F | Objections, documents and/or translations needed after an examination request [chapter 6.6 patent gazette]
2019-07-23 | B06U | Preliminary requirement: requests with searches performed by other patent offices; procedure suspended [chapter 6.21 patent gazette]
2019-10-01 | B06I | Publication of requirement cancelled [chapter 6.9 patent gazette] | Free format text: The publication of code 6.21 in RPI No. 2533 of 23/07/2019 is annulled because it was improper.
2019-10-08 | B06U | Preliminary requirement: requests with searches performed by other patent offices; procedure suspended [chapter 6.21 patent gazette]
2020-12-08 | B07A | Application suspended after technical examination (opinion) [chapter 7.1 patent gazette]
2021-03-30 | B09A | Decision: intention to grant [chapter 9.1 patent gazette]
2021-06-22 | B16A | Patent or certificate of addition of invention granted [chapter 16.1 patent gazette] | Free format text: Term of validity: 20 (twenty) years counted from 06/08/2010, subject to the legal conditions. Patent granted in accordance with ADI 5.529/DF, which determines the change of the grant term.
Priority:
Application number | Filing date | Patent title
US23239709P | 2009-08-07 | 2009-08-07 |
US61/232,397 | 2009-08-07 |
US37078410P | 2010-08-04 | 2010-08-04 |
US61/370,784 | 2010-08-04 |
US12/851,473 | 2010-08-05 |
US12/851,473 | US8670597B2 | 2009-08-07 | 2010-08-05 | Facial recognition with social network aiding
PCT/US2010/044771 | WO2011017653A1 | 2009-08-07 | 2010-08-06 | Facial recognition with social network aiding